I get the following error message when I run the NCE example:
```
ubuntu@ip-172-31-45-236:~/rnn/examples$ th noise-contrastive-estimate.lua --progress --earlystop 50 --cuda --device 1 --seqlen 20 --hiddensize '{200,200}' --batchsize 20 --startlr 1 --uniform 0.1 --cutoff 5 --schedule '{[5]=0.5,[6]=0.25,[7]=0.125,[8]=0.0625,[9]=0.03125,[10]=0.015625,[11]=0.0078125,[12]=0.00390625}'
{
  Z : 1
  batchsize : 20
  continue : ""
  cuda : true
  cutoff : 5
  device : 1
  dontsave : false
  dropout : 0
  earlystop : 50
  hiddensize : {200,200}
  id : "gbw:ip-172-31-45-236:1513972326:1"
  inputsize : 200
  k : 100
  maxepoch : 1000
  maxnormout : -1
  minlr : 1e-05
  momentum : 0.9
  profile : false
  progress : true
  projsize : -1
  rownoise : false
  saturate : 400
  savepath : "/home/ubuntu/save/rnnlm"
  schedule : {[5]=0.5,[6]=0.25,[7]=0.125,[8]=0.0625,[9]=0.03125,[10]=0.015625,[11]=0.0078125,[12]=0.00390625}
  seqlen : 20
  silent : false
  startlr : 1
  tiny : false
  trainsize : 400000
  uniform : 0.1
  validsize : 40000
  version : 6
}
loading /home/ubuntu/rnn/examples/BillionWords/train_data.th7
Formatting raw tensor into table of sequences
saving cache /home/ubuntu/rnn/examples/BillionWords/train_data.cache.t7
/home/ubuntu/src/torch/install/bin/luajit: not enough memory
```
But based on what I observe in `top`, there is always sufficient memory, as well as disk space:
```
ubuntu@ip-172-31-45-236:~/rnn/examples$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            361G   12K  361G   1% /dev
tmpfs            73G  912K   73G   1% /run
/dev/xvda1      126G   80G   42G  66% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            361G     0  361G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/xvdf        63G  5.7G   54G  10% /home/ubuntu/data
```
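One thing I am not sure how to rule out is LuaJIT's own heap limit: as far as I understand, stock (non-GC64) LuaJIT on x86-64 caps its GC-managed heap at roughly 1-2 GB regardless of free system RAM, so the "Formatting raw tensor into table of sequences" step could exhaust the interpreter heap even while `top` shows plenty of memory. Below is a minimal sketch of how I would watch the Lua-side heap; the table-building loop is hypothetical, just standing in for the formatting step, while `collectgarbage("count")` is standard Lua:

```lua
require 'torch'

-- Hypothetical stand-in for the "Formatting raw tensor into table of
-- sequences" step: build a large Lua table and print the size of the
-- GC-managed heap as it grows. On stock LuaJIT this heap is capped at
-- roughly 1-2 GB on x86-64, independent of system RAM.
local sequences = {}
for i = 1, 1e7 do
   -- tensor storage lives outside the Lua heap, but every table slot
   -- and tensor userdata still consumes GC-managed memory
   sequences[i] = torch.Tensor(20)
   if i % 1e6 == 0 then
      -- collectgarbage("count") returns the Lua heap size in kilobytes
      print(string.format("lua heap: %.1f MB", collectgarbage("count") / 1024))
   end
end
```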
What went wrong?