ghost opened this issue 8 years ago
Happens to me too. The current implementation needs about 20 times as much RAM as the size of the input file: 500 MB of input trains fine using something in the neighborhood of 10 GB of RAM.
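For what it's worth, a rough sketch of where a multiple like that can come from (illustrative only, not the actual preprocessing code in this repo): encoding each character as a default 8-byte integer, plus the throwaway Python list used to build the array, inflates the footprint quickly, whereas a one- or two-byte dtype keeps the encoded tensor close to the input size.

```python
import numpy as np

# Rough illustration (not the repo's actual loader).
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}

# Default dtype (int64 on most platforms): ~8 bytes per character.
ids_int64 = np.array([vocab[ch] for ch in text])

# Small dtype, assuming the vocabulary fits in one or two bytes: ~1-2 bytes per character.
dtype = np.uint8 if len(vocab) <= 256 else np.uint16
ids_small = np.fromiter((vocab[ch] for ch in text), dtype=dtype, count=len(text))

print(ids_int64.nbytes, ids_small.nbytes)
```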
Thanks for the report. @Alicemargatroid @izqui. I need some help figuring out the right way to fix this problem.
How big is the data.npy file? Is it 20 times as large as well? Should we optimize the data structure or switch to a streaming loader?
@sherjilozair I think a streaming loader would be best.
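If it helps, here's a minimal sketch of what a streaming loader could look like. The `stream_batches` name and the chunking details are just mine, and it assumes the char-to-id dict saved in vocab.pkl covers every character in the file:

```python
import numpy as np

def stream_batches(path, vocab, batch_size, seq_length, chunk_chars=1_000_000):
    """Hypothetical streaming loader: read input.txt in fixed-size chunks and
    yield (x, y) batches without materializing the whole corpus."""
    dtype = np.uint8 if len(vocab) <= 256 else np.uint16
    leftover = np.empty(0, dtype=dtype)
    batch_chars = batch_size * seq_length + 1  # +1 so targets can be shifted by one
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_chars)
            if not chunk:
                break
            ids = np.fromiter((vocab[ch] for ch in chunk), dtype=dtype, count=len(chunk))
            buf = np.concatenate([leftover, ids])
            # Emit as many full batches as the buffer currently allows.
            while len(buf) >= batch_chars:
                x = buf[:batch_chars - 1].reshape(batch_size, seq_length)
                y = buf[1:batch_chars].reshape(batch_size, seq_length)
                yield x, y
                buf = buf[batch_chars - 1:]
            leftover = buf
```

The training loop would then iterate over this generator instead of indexing into one pre-loaded tensor, so peak memory stays at roughly one chunk plus one batch.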
@sherjilozair Right now char-rnn is using 13.54 GB of RAM, and these are the sizes of the data files:
-rw-r--r-- 1 root root 6254212000 Jun 6 16:50 data.npy
-rw-r--r-- 1 root root 781776490 Jun 6 16:22 input.txt
-rw-r--r-- 1 root root 1357 Jun 6 16:47 vocab.pkl
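data.npy is roughly 8x the size of input.txt, which is what you'd expect if every character is stored as an 8-byte integer. As a stop-gap on the loading side, memory-mapping it instead of loading it outright might already avoid pulling all ~6 GB into RAM at once (a sketch, assuming data.npy holds the pre-encoded corpus):

```python
import numpy as np

# Memory-map the encoded corpus; pages are read from disk on demand,
# so slicing out a batch only touches the data it actually needs.
data = np.load("data.npy", mmap_mode="r")

# Example: materialize just enough ids for one 50x50 batch.
batch = np.asarray(data[:50 * 50 + 1])
```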
When training on large files, I get a MemoryError despite having more than enough memory to hold the file: