PaddlePaddle / Paddle

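PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle ('飞桨'): high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)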
http://www.paddlepaddle.org/
Apache License 2.0
22.18k stars 5.57k forks

Check failed: posix_memalign(&ptr, 4096ul, size) == 0 (12 vs. 0) #11082

Closed. Adagch closed this issue 6 years ago

Adagch commented 6 years ago

After the script finishes executing print type(parameters), it hangs for a long time and then fails with the error Check failed: posix_memalign(&ptr, 4096ul, size) == 0 (12 vs. 0). The output is shown below. If I remove feeding and everything after it, the script still fails with the same error.

I0531 16:48:15.966539 2922 Util.cpp:166] commandline: --use_gpu=False
W0531 16:48:15.966620 2922 CpuId.h:112] PaddlePaddle wasn't compiled to use avx instructions, but these are available on your machine and could speed up CPU computations via CMAKE .. -DWITH_AVX=ON
<class 'paddle.trainer_config_helpers.layers.LayerOutput'>
<paddle.trainer_config_helpers.layers.LayerOutput object at 0x7f2d4d0ee110>
<class 'paddle.trainer_config_helpers.layers.LayerOutput'>
<paddle.trainer_config_helpers.layers.LayerOutput object at 0x7f2d4ce83210>
<class 'paddle.trainer_config_helpers.layers.LayerOutput'>
<class 'paddle.trainer_config_helpers.layers.LayerOutput'>
<class 'paddle.v2.parameters.Parameters'>
F0531 16:50:16.227624 2922 Allocator.h:54] Check failed: posix_memalign(&ptr, 4096ul, size) == 0 (12 vs. 0)
Check failure stack trace:
@ 0x7f2d5765b3dd google::LogMessage::Fail()
@ 0x7f2d5765ee8c google::LogMessage::SendToLog()
@ 0x7f2d5765af03 google::LogMessage::Flush()
@ 0x7f2d5766039e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f2d575aff38 paddle::CpuAllocator::alloc()
@ 0x7f2d575aa82f paddle::PoolAllocator::alloc()
@ 0x7f2d575aa316 paddle::CpuMemoryHandle::CpuMemoryHandle()
@ 0x7f2d575b5dbe paddle::CpuVectorT<>::CpuVectorT()
@ 0x7f2d575b692a paddle::VectorT<>::create()
@ 0x7f2d575b6a49 paddle::VectorT<>::createParallelVector()
@ 0x7f2d572f9c26 paddle::Parameter::enableType()
@ 0x7f2d572f7d42 paddle::NeuralNetwork::init()
@ 0x7f2d57314691 paddle::GradientMachine::create()
@ 0x7f2d576383d3 GradientMachine::createFromPaddleModelPtr()
@ 0x7f2d576385af GradientMachine::createByConfigProtoStr()
@ 0x7f2d572b700d _wrap_GradientMachine_createByConfigProtoStr
@ 0x4c30ce PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c1e6f PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c16e7 PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4d55f3 (unknown)
@ 0x4eebee (unknown)
@ 0x4ee7f6 (unknown)
@ 0x4aa9ab (unknown)
@ 0x4c15bf PyEval_EvalFrameEx
@ 0x4c136f PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4eb30f (unknown)
@ 0x4e5422 PyRun_FileExFlags
@ 0x4e3cd6 PyRun_SimpleFileExFlags
Aborted (core dumped)

The code is below:

# coding=utf-8

import paddle.v2 as paddle
import cPickle
import copy
import os
import numpy


def main():
    paddle.init(use_gpu=False)

    # User ID
    uid = paddle.layer.data(
        name='user_id',
        type=paddle.data_type.integer_value(82542756))  # max
    # print uid
    usr_emb = paddle.layer.embedding(input=uid, size=8)
    usr_fc = paddle.layer.fc(input=usr_emb, size=8)

    # Age ID
    usr_age_id = paddle.layer.data(
        name='age_id',
        type=paddle.data_type.integer_value(6))  # max
    usr_age_emb = paddle.layer.embedding(input=usr_age_id, size=8)
    usr_age_fc = paddle.layer.fc(input=usr_age_emb, size=8)

    # Gender ID
    usr_gender_id = paddle.layer.data(
        name='gender_id',
        type=paddle.data_type.integer_value(3))  # max
    usr_gender_emb = paddle.layer.embedding(input=usr_gender_id, size=8)
    usr_gender_fc = paddle.layer.fc(input=usr_gender_emb, size=8)

    # Marital status
    usr_marry_id = paddle.layer.data(
        name='marry_id',
        type=paddle.data_type.integer_value(16))  # max
    usr_marry_emb = paddle.layer.embedding(input=usr_marry_id, size=8)
    usr_marry_fc = paddle.layer.fc(input=usr_marry_emb, size=8)

    # Education level
    usr_education_id = paddle.layer.data(
        name='education_id',
        type=paddle.data_type.integer_value(8))  # max
    usr_education_emb = paddle.layer.embedding(input=usr_education_id, size=8)
    usr_education_fc = paddle.layer.fc(input=usr_education_emb, size=8)

    # Consumption capacity
    usr_consume_id = paddle.layer.data(
        name='consume_id',
        type=paddle.data_type.integer_value(3))  # max
    usr_consume_emb = paddle.layer.embedding(input=usr_consume_id, size=8)
    usr_consume_fc = paddle.layer.fc(input=usr_consume_emb, size=8)

    # Geographic location
    usr_lbs_id = paddle.layer.data(
        name='lbs_id',
        type=paddle.data_type.integer_value(998))  # max
    usr_lbs_emb = paddle.layer.embedding(input=usr_lbs_id, size=8)
    usr_lbs_fc = paddle.layer.fc(input=usr_lbs_emb, size=8)

    # Ad ID
    ad_id = paddle.layer.data(
        name='ad_id',
        type=paddle.data_type.integer_value(2217))  # largest ID is 2216
    ad_emb = paddle.layer.embedding(input=ad_id, size=8)
    ad_fc = paddle.layer.fc(input=ad_emb, size=8)

    # Ad advertiserId
    advertiser_id = paddle.layer.data(
        name='advertiser_id',
        type=paddle.data_type.integer_value(158680))  # largest ID is 158679
    advertiser_id_emb = paddle.layer.embedding(input=advertiser_id, size=8)
    advertiser_id_fc = paddle.layer.fc(input=advertiser_id_emb, size=8)

    # Ad campaignId
    campaign_id = paddle.layer.data(
        name='campaignId',
        type=paddle.data_type.integer_value(766461))  # max
    campaign_id_emb = paddle.layer.embedding(input=campaign_id, size=8)
    campaign_id_fc = paddle.layer.fc(input=campaign_id_emb, size=8)

    # Ad creativeId
    creative_id = paddle.layer.data(
        name='creativeId',
        type=paddle.data_type.integer_value(1806761))  # max
    creative_id_emb = paddle.layer.embedding(input=creative_id, size=8)
    creative_id_fc = paddle.layer.fc(input=creative_id_emb, size=8)

    # Ad creativeSize
    creative_size_id = paddle.layer.data(
        name='creativeSize',
        type=paddle.data_type.integer_value(110))  # max
    creative_size_emb = paddle.layer.embedding(input=creative_size_id, size=8)
    creative_size_fc = paddle.layer.fc(input=creative_size_emb, size=8)

    # Ad adCategoryId
    ad_category_id = paddle.layer.data(
        name='adCategoryId',
        type=paddle.data_type.integer_value(283))  # max
    ad_category_id_emb = paddle.layer.embedding(input=ad_category_id, size=8)
    ad_category_id_fc = paddle.layer.fc(input=ad_category_id_emb, size=8)

    # Ad productId
    product_id = paddle.layer.data(
        name='productId',
        type=paddle.data_type.integer_value(28987))  # max
    product_id_emb = paddle.layer.embedding(input=product_id, size=8)
    product_id_fc = paddle.layer.fc(input=product_id_emb, size=8)

    # Ad productType
    product_type = paddle.layer.data(
        name='productType',
        type=paddle.data_type.integer_value(12))  # max
    product_type_emb = paddle.layer.embedding(input=product_type, size=8)
    product_type_fc = paddle.layer.fc(input=product_type_emb, size=8)

    usr_combined_features = paddle.layer.fc(
        input=[usr_fc, usr_age_fc, usr_gender_fc, usr_marry_fc,
               usr_education_fc, usr_consume_fc, usr_lbs_fc],
        size=50,
        act=paddle.activation.Tanh())
    print type(usr_combined_features)
    print usr_combined_features
    ad_combined_features = paddle.layer.fc(
        input=[ad_fc, advertiser_id_fc, campaign_id_fc, creative_id_fc,
               creative_size_fc, ad_category_id_fc, product_id_fc,
               product_type_fc],
        size=50,
        act=paddle.activation.Tanh())
    print type(ad_combined_features)
    print ad_combined_features
    inference = paddle.layer.cos_sim(
        a=usr_combined_features, b=ad_combined_features, size=1, scale=5)
    print type(inference)
    cost = paddle.layer.square_error_cost(
        input=inference,
        label=paddle.layer.data(name='label', type=paddle.data_type.integer_value(2)))
    print type(cost)
    parameters = paddle.parameters.create(cost)
    print type(parameters)
    trainer = paddle.trainer.SGD(
        cost=cost,
        parameters=parameters,
        update_equation=paddle.optimizer.Adam(learning_rate=1e-4))
    print trainer
    print type(trainer)

    # Maps each data layer name to its column index in a row of t2_1.txt.
    feeding = {
        'label': 0,
        'ad_id': 1,
        'advertiser_id': 2,
        'campaignId': 3,
        'creativeId': 4,
        'creativeSize': 5,
        'adCategoryId': 6,
        'productId': 7,
        'productType': 8,
        'user_id': 9,
        'age_id': 10,
        'gender_id': 11,
        'marry_id': 12,
        'education_id': 13,
        'consume_id': 14,
        'lbs_id': 15
    }

    def event_handler(event):
        if isinstance(event, paddle.event.EndIteration):
            if event.batch_id % 100 == 0:
                print "Pass %d Batch %d Cost %.2f" % (
                    event.pass_id, event.batch_id, event.cost)

    # Newly added / modified part
    def train_reader():
        train_1 = numpy.loadtxt('t2_1.txt', delimiter=',')

        def reader():
            for i in xrange(len(train_1)):
                yield train_1[i]

        return reader()

    trainer.train(
        reader=paddle.batch(
            paddle.reader.shuffle(
                train_reader, buf_size=8192),
            batch_size=256),
        event_handler=event_handler,
        feeding=feeding,
        num_passes=1)

if __name__ == '__main__':
    main()

The contents of the file t2_1.txt are as follows (100 rows):

1,748,8203,37818,202309,59,142,0,6,3467967,4,1,15,7,1,244
1,1119,3993,63752,798752,59,10,19256,11,63515159,3,1,11,6,1,192
1,1566,6946,296367,520004,59,24,3794,11,29244339,3,1,13,2,2,592
1,1201,5552,68476,1172593,35,27,113,9,52296860,5,2,12,6,1,72
1,311,915,994,27461,60,51,0,4,15851097,1,1,10,2,0,810
1,765,388,134068,1271219,35,27,113,9,74834157,1,2,10,6,2,446
1,2196,20943,445098,767513,53,4,0,4,8762244,4,2,10,2,1,432
1,1512,702,20048,1246897,42,34,5615,11,68672800,4,1,11,2,1,152
1,1291,1082,40405,1434096,53,13,0,6,50286402,1,1,10,7,1,115
1,725,370,170485,1485462,22,67,113,9,35500121,2,2,10,2,0,687
1,1119,3993,63752,798752,59,10,19256,11,58854550,3,1,11,1,1,75
1,70,327,5616,5977,22,27,113,9,65402229,2,2,10,1,2,246
1,692,6946,296367,455396,59,24,3794,11,68071308,3,1,0,1,2,188
1,70,327,5616,5977,22,27,113,9,71317618,2,2,13,6,1,986
1,411,9106,163120,220179,79,21,0,4,24145001,2,1,13,2,1,112
1,692,6946,296367,455396,59,24,3794,11,26445217,1,1,10,7,1,348
1,916,17597,51385,838056,35,25,0,6,52724599,1,2,10,6,2,317
1,231,11487,159012,991964,20,1,0,4,61919288,5,2,0,3,0,83
1,117,702,18552,619519,53,24,5615,11,1950387,3,1,10,1,0,83
1,2050,19441,178687,245165,53,1,0,6,6194240,3,1,0,2,0,435
1,1415,133292,464828,1334609,22,74,0,4,43057817,4,1,11,2,1,816
1,411,9106,163120,220179,79,21,0,4,38047467,5,2,10,7,1,348
1,1415,133292,464828,1334609,22,74,0,4,39110333,5,1,11,6,1,458
1,70,327,5616,5977,22,27,113,9,39721813,2,2,10,2,2,246
1,1415,133292,464828,1334609,22,74,0,4,51834299,4,1,11,6,1,338
1,1119,3993,63752,798752,59,10,19256,11,21468794,3,1,10,2,1,361
1,2013,6937,186348,1427984,35,89,3791,9,11037442,2,1,10,2,1,458
1,914,47823,111645,141973,100,21,0,4,812262,4,1,11,7,1,329
1,765,388,134068,1271219,35,27,113,9,48244145,2,2,10,2,1,94
1,136,452,50305,1187573,35,10,7992,11,47172393,4,1,11,2,2,687
1,1119,3993,63752,798752,59,10,19256,11,43562595,3,1,11,1,2,353
1,369,66025,170445,1229175,109,94,0,4,50891107,5,1,11,7,1,112
1,1023,8494,12711,192305,42,24,4666,11,48487376,4,1,11,1,0,271
1,206,13915,23303,440096,91,108,0,4,23173916,1,1,13,7,1,360
1,543,1082,295940,1391569,59,22,0,6,24920509,1,2,13,7,1,115
1,1291,1082,40405,1434096,53,13,0,6,25555458,1,1,13,7,1,458
1,975,8494,76011,913588,59,24,4666,11,63744251,2,1,0,7,1,809
1,411,9106,163120,220179,79,21,0,4,34978201,2,1,10,6,1,783
1,1827,17597,51385,1236432,35,27,0,6,42198245,5,2,6,1,2,839
1,411,9106,163120,220179,79,21,0,4,5567656,5,1,10,7,1,857
1,404,821,888,1353465,59,10,439,11,79114026,2,1,10,2,0,803
1,1468,915,994,1610899,60,51,0,4,47258910,2,1,10,2,1,592
1,1119,3993,63752,798752,59,10,19256,11,18954714,3,1,10,6,0,514
1,2118,11195,19215,1012717,53,140,0,4,45501967,2,1,11,7,1,192
1,1749,21359,361928,585909,100,21,0,4,73670587,5,1,11,1,0,112
1,1119,3993,63752,798752,59,10,19256,11,53203349,3,1,0,2,1,524
1,1021,388,243160,1249596,22,27,113,9,33579047,5,2,13,7,1,464
1,725,370,170485,1485462,22,67,113,9,80887595,4,2,11,2,1,585
1,914,47823,111645,141973,100,21,0,4,44336434,4,1,11,7,1,737
1,1415,133292,464828,1334609,22,74,0,4,16946731,4,1,11,7,1,18
1,2048,8203,37818,240336,59,142,0,6,39627326,4,2,11,7,1,27
1,174,11487,668182,1512679,22,21,0,4,27171171,3,1,11,3,1,833
1,311,915,994,27461,60,51,0,4,53203989,1,1,10,1,2,56
1,914,47823,111645,141973,100,21,0,4,52466809,2,1,5,6,1,652
1,1931,7300,36763,1401261,79,21,0,4,24594565,5,1,13,7,1,458
1,1918,158679,643438,1690612,60,4,0,4,22757215,5,1,11,2,2,879
1,562,21017,167166,864509,109,179,0,4,20318516,1,2,10,7,2,792
1,692,6946,296367,455396,59,24,3794,11,52429638,3,1,0,2,1,859
1,1468,915,994,1610899,60,51,0,4,47128987,1,1,10,2,1,375
1,1918,158679,643438,1690612,60,4,0,4,69214066,4,1,11,7,0,514
1,1468,915,994,1610899,60,51,0,4,70426873,2,1,13,7,2,27
1,1790,8494,12711,1636465,42,24,4666,11,41055704,4,1,11,1,1,112
1,765,388,134068,1271219,35,27,113,9,47613506,1,2,10,0,2,194
1,389,9106,662422,1354071,79,21,0,4,53672317,2,1,10,2,0,932
1,311,915,994,27461,60,51,0,4,45284219,1,1,10,1,1,121
1,1338,702,12724,1147463,105,10,4669,11,44449150,3,1,13,2,1,267
1,1350,7565,353610,1554384,109,94,0,4,39622577,1,1,13,7,1,346
1,1918,158679,643438,1690612,60,4,0,4,80390066,4,1,11,2,2,29
1,18,8203,74452,857791,95,218,0,6,49571126,1,2,10,7,2,737
1,173,6937,186348,267290,35,89,3791,9,50677511,2,1,0,7,1,864
1,1291,1082,40405,1434096,53,13,0,6,54132009,1,1,13,7,1,555
1,2205,370,286844,1149439,22,67,113,9,26627959,1,2,13,6,2,270
1,792,8350,331396,1564743,22,59,0,4,73597952,1,2,13,2,2,921
1,70,327,5616,5977,22,27,113,9,16338888,2,2,10,2,2,432
1,916,17597,51385,838056,35,25,0,6,16399840,2,2,13,2,1,346
1,70,327,5616,5977,22,27,113,9,63966327,2,2,10,3,2,45
1,411,9106,163120,220179,79,21,0,4,48662777,2,1,11,6,1,273
1,692,6946,296367,455396,59,24,3794,11,12380611,1,1,10,7,0,774
1,1379,8864,90700,469197,22,27,113,9,69573669,2,1,10,2,1,209
1,2050,19441,178687,245165,53,1,0,6,30891193,5,1,11,7,1,441
1,1596,24704,48236,181137,35,27,113,9,32263276,2,2,0,7,2,244
1,302,18621,745599,1628574,91,21,0,4,67276726,3,2,11,2,0,856
1,914,47823,111645,141973,100,21,0,4,32767857,4,1,11,1,1,421
1,1950,41806,233191,1016027,35,13,27855,9,17426513,2,1,10,6,1,737
1,404,821,888,1353465,59,10,439,11,18170127,4,1,15,2,0,0
1,2118,11195,19215,1012717,53,140,0,4,47783650,1,1,10,2,1,687
1,1596,24704,48236,181137,35,27,113,9,20659211,2,2,10,1,2,346
1,1483,1082,40405,418462,53,13,0,6,38444025,2,1,13,6,1,486
1,1291,1082,40405,1434096,53,13,0,6,58341820,2,1,10,2,1,964
1,792,8350,331396,1564743,22,59,0,4,55466074,1,2,12,7,1,964
1,302,18621,745599,1628574,91,21,0,4,77134272,3,2,13,2,1,86
1,1377,388,209098,1146648,35,27,113,9,28379936,5,2,13,2,2,61
1,692,6946,296367,455396,59,24,3794,11,25584257,2,1,10,2,1,833
1,792,8350,331396,1564743,22,59,0,4,71750264,1,2,10,7,0,678
1,1017,11487,741453,1614385,22,21,0,4,13473925,4,1,11,1,1,170
1,1140,43189,98158,1305307,22,43,28986,9,2974662,5,1,10,7,2,94
1,1254,8350,244601,1383456,35,59,0,4,31785917,5,2,6,6,1,585
1,191,25485,50138,58465,35,51,15454,11,24176325,2,1,10,7,0,774
1,692,6946,296367,455396,59,24,3794,11,26473709,1,1,13,5,1,458
1,1291,1082,40405,1434096,53,13,0,6,47137371,1,2,10,7,1,921
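
A quick way to sanity-check rows like these against the script is to look up each column through the feeding map and compare it with the range declared in the matching paddle.data_type.integer_value(...) call. The sketch below uses plain Python and does not touch PaddlePaddle. The `declared` dict and the `out_of_range` helper are hypothetical names introduced for illustration; the numbers are copied from the script in this issue.

```python
# Column index of each data layer, copied from the feeding dict in the script.
feeding = {
    'label': 0, 'ad_id': 1, 'advertiser_id': 2, 'campaignId': 3,
    'creativeId': 4, 'creativeSize': 5, 'adCategoryId': 6, 'productId': 7,
    'productType': 8, 'user_id': 9, 'age_id': 10, 'gender_id': 11,
    'marry_id': 12, 'education_id': 13, 'consume_id': 14, 'lbs_id': 15,
}

# Range passed to integer_value(...) for each data layer, copied from the script.
declared = {
    'label': 2, 'ad_id': 2217, 'advertiser_id': 158680, 'campaignId': 766461,
    'creativeId': 1806761, 'creativeSize': 110, 'adCategoryId': 283,
    'productId': 28987, 'productType': 12, 'user_id': 82542756, 'age_id': 6,
    'gender_id': 3, 'marry_id': 16, 'education_id': 8, 'consume_id': 3,
    'lbs_id': 998,
}


def out_of_range(row):
    """Return {name: value} for every field outside [0, declared[name])."""
    return {
        name: row[idx]
        for name, idx in feeding.items()
        if not 0 <= row[idx] < declared[name]
    }


# First row of t2_1.txt as posted above.
first_row = [1, 748, 8203, 37818, 202309, 59, 142, 0, 6,
             3467967, 4, 1, 15, 7, 1, 244]
print(out_of_range(first_row))
```

An empty result means every field of that row fits the declared range. The check says nothing about memory use; it only confirms that the column order in feeding matches the data.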

gongweibao commented 6 years ago

http://www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/Errors/unix_system_errors.html

#define ENOMEM          12      /* Out of memory */

Out of memory?
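
For reference, posix_memalign reports failure by returning an error number directly (it does not set errno), so the 12 in "(12 vs. 0)" is that return value. A minimal sketch for looking the number up from Python's standard library, assuming it runs on the same kind of system as the failing one:

```python
import errno
import os

code = 12  # left-hand value in "Check failed: ... == 0 (12 vs. 0)"

# Symbolic name of the error number on this platform.
print(errno.errorcode[code])

# Human-readable message; the exact wording is platform-specific.
print(os.strerror(code))
```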

Adagch commented 6 years ago

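@gongweibao But the data is only 100 rows, which is not large. I gave it 8 GB of memory and 4 processors and it still fails with this error. Also, when I delete feeding and everything after it, I still get the same error. At that point the file is not even being read, so what could be using the memory? When I run PaddlePaddle's recommender example there is no problem and it runs fine, so I am puzzled...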

Adagch commented 6 years ago

I found the problem: the number passed to paddle.data_type.integer_value was too large. After making it smaller, everything works.
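
A rough estimate suggests why the size of that number matters even before any data is read. An embedding over integer_value(N) with size=8 needs a table of N x 8 values. The sketch below assumes 4-byte floats. It further assumes that training with Adam keeps about four buffers per parameter (value, gradient, and two moment estimates), which is typical for dense updates but is not confirmed for this PaddlePaddle build. `embedding_bytes` is a hypothetical helper, not a PaddlePaddle API.

```python
BYTES_PER_FLOAT = 4   # assumption: single-precision parameters
EMB_SIZE = 8          # size=8 in every paddle.layer.embedding call above
GIB = 1024 ** 3


def embedding_bytes(vocab, emb_size=EMB_SIZE, buffers=1):
    """Bytes needed for `buffers` dense copies of a vocab x emb_size table."""
    return vocab * emb_size * BYTES_PER_FLOAT * buffers


user_vocab = 82542756  # integer_value(...) for 'user_id' in the script

one_copy = embedding_bytes(user_vocab)
# Assumption: value + gradient + two Adam moment buffers.
with_adam = embedding_bytes(user_vocab, buffers=4)

print("one copy:  %.2f GiB" % (one_copy / float(GIB)))
print("4 buffers: %.2f GiB" % (with_adam / float(GIB)))
```

Under these assumptions the user_id table alone would ask for more than the 8 GB mentioned earlier in the thread, which would be consistent with the allocation failing inside paddle::Parameter::enableType in the stack trace. Since the table size grows linearly with the declared range, one possible approach is to remap raw IDs to a compact index before training so that a much smaller value can be passed to integer_value.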