Element-Research / rnn

Recurrent Neural Network library for Torch7's nn
BSD 3-Clause "New" or "Revised" License

error: not enough memory when running noise-contrastive-estimate example #429

Status: Closed (eric-haibin-lin closed this issue 6 years ago)

eric-haibin-lin commented 6 years ago

I got the following error message when I ran the NCE example:

ubuntu@ip-172-31-45-236:~/rnn/examples$ th noise-contrastive-estimate.lua --progress --earlystop 50 --cuda --device 1 --seqlen 20 --hiddensize '{200,200}' --batchsize 20 --startlr 1 --uniform 0.1 --cutoff 5 --schedule '{[5]=0.5,[6]=0.25,[7]=0.125,[8]=0.0625,[9]=0.03125,[10]=0.015625,[11]=0.0078125,[12]=0.00390625}'
{
   Z : 1
   batchsize : 20
   continue : ""
   cuda : true
   cutoff : 5
   device : 1
   dontsave : false
   dropout : 0
   earlystop : 50
   hiddensize : {200,200}
   id : "gbw:ip-172-31-45-236:1513972326:1"
   inputsize : 200
   k : 100
   maxepoch : 1000
   maxnormout : -1
   minlr : 1e-05
   momentum : 0.9
   profile : false
   progress : true
   projsize : -1
   rownoise : false
   saturate : 400
   savepath : "/home/ubuntu/save/rnnlm"
   schedule : {[5]=0.5,[6]=0.25,[7]=0.125,[8]=0.0625,[9]=0.03125,[10]=0.015625,[11]=0.0078125,[12]=0.00390625}
   seqlen : 20
   silent : false
   startlr : 1
   tiny : false
   trainsize : 400000
   uniform : 0.1
   validsize : 40000
   version : 6
}
loading /home/ubuntu/rnn/examples/BillionWords/train_data.th7
Formatting raw tensor into table of sequences
saving cache /home/ubuntu/rnn/examples/BillionWords/train_data.cache.t7
/home/ubuntu/src/torch/install/bin/luajit: not enough memory

But based on what I observed in top, there is always sufficient memory, and df shows plenty of free disk space:

ubuntu@ip-172-31-45-236:~/rnn/examples$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            361G   12K  361G   1% /dev
tmpfs            73G  912K   73G   1% /run
/dev/xvda1      126G   80G   42G  66% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            361G     0  361G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/xvdf        63G  5.7G   54G  10% /home/ubuntu/data

What went wrong?
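
Note that df -h reports filesystem usage, not RAM, so it cannot confirm or rule out a memory shortage by itself. The following is a minimal diagnostic sketch, assuming a Linux host that exposes /proc/meminfo with a MemAvailable field. The function names mem_available_kb and vm_limit are hypothetical helpers introduced here for illustration. It prints the RAM the kernel considers available and the per-process virtual memory limit of the current shell.

```shell
#!/bin/sh
# Diagnostic sketch (assumption: Linux with /proc/meminfo and MemAvailable).

# Print the kernel's estimate of available RAM, in kB.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}

# Print the per-process virtual memory limit of this shell
# ("unlimited" or a value in kB).
vm_limit() {
    ulimit -v
}

echo "MemAvailable: $(mem_available_kb) kB"
echo "ulimit -v:    $(vm_limit)"
```

If both values look generous, the limit may lie inside the interpreter rather than the system. Stock LuaJIT builds on x86-64 (without GC64 mode) cap Lua-managed memory at roughly 1 to 2 GB regardless of installed RAM, and that cap produces this same "not enough memory" message. Whether that is the cause here is not confirmed by anything in this report.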