harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

preprocess-shards compatibility with train #77

Closed hsajjad closed 7 years ago

hsajjad commented 7 years ago

Hi,

seq2seq-attn uses a lot of RAM on a large data set, so we tried preprocessing the data with preprocess-shards, but we get the following error during training. Any help would be highly appreciated.

/torch/install/bin/luajit: .../torch/install/share/lua/5.1/hdf5/group.lua:312: HDF5Group:read() - no such child 'num_source_features' for [HDF5Group 33554432 /]
stack traceback:
    [C]: in function 'error'
    .../torch/install/share/lua/5.1/hdf5/group.lua:312: in function 'read'
    ./s2sa/data.lua:69: in function '__init'
    .../torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
    train.lua:969: in function 'main'
    train.lua:1074: in main chunk
    [C]: in function 'dofile'
    ...ools/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
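If it helps anyone debugging the same mismatch: the error means data.lua expects a `num_source_features` dataset in the HDF5 file that the shards don't contain. A quick way to check is to open one shard with h5py and list its top-level datasets. This is only a diagnostic sketch; the shard file name below is a placeholder for whatever preprocess-shards produced on your side.

```python
import h5py

# Placeholder file name: substitute the path of one of your generated shards.
with h5py.File("data-train.1.hdf5", "r") as f:
    print(sorted(f.keys()))                 # all top-level datasets stored in the shard
    print("num_source_features" in f)       # False would explain the HDF5Group:read() error above
```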

guillaumekln commented 7 years ago

Hi,

See #69.