harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

preprocess-shards compatibility with train #77

Closed hsajjad closed 7 years ago

hsajjad commented 7 years ago

Hi,

seq2seq-attn uses a lot of RAM on a large data set, so we tried preprocessing the data with preprocess-shards, but we get the following error during training. Any help would be highly appreciated.

/torch/install/bin/luajit: .../torch/install/share/lua/5.1/hdf5/group.lua:312: HDF5Group:read() - no such child 'num_source_features' for [HDF5Group 33554432 /]
stack traceback:
    [C]: in function 'error'
    .../torch/install/share/lua/5.1/hdf5/group.lua:312: in function 'read'
    ./s2sa/data.lua:69: in function '__init'
    .../torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
    train.lua:969: in function 'main'
    train.lua:1074: in main chunk
    [C]: in function 'dofile'
    ...ools/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
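If it helps anyone debugging the same mismatch: the error means data.lua expects a `num_source_features` dataset in the HDF5 file that the shards don't contain. A quick way to check is to open one shard with h5py and list its top-level datasets. This is only a diagnostic sketch; the shard file name below is a placeholder for whatever preprocess-shards produced on your side.

```python
import h5py

# Placeholder file name: substitute the path of one of your generated shards.
with h5py.File("data-train.1.hdf5", "r") as f:
    print(sorted(f.keys()))                 # all top-level datasets stored in the shard
    print("num_source_features" in f)       # False would explain the HDF5Group:read() error above
```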

guillaumekln commented 7 years ago

Hi,

See #69.