harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

Data parallel training #59

Closed · hrishikeshvganu closed this issue 8 years ago

hrishikeshvganu commented 8 years ago

I noticed that the model allows for "model parallel" training. Does the code also support "data parallel" training automatically (i.e., without explicitly assigning data to different GPUs)?

My goal is to see whether I can get an approximately linear speed-up on the 16-GPU machines from AWS.

yoonkim commented 8 years ago

Hi, the code does not support data parallelism. I haven't experimented with data parallelism in Torch, but from some light googling it seems like it should be possible. Let us know if you end up implementing it and we'll merge!
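
For reference, below is a minimal sketch of how data parallelism is usually set up in Torch with cunn's `nn.DataParallelTable`, which splits each mini-batch across GPUs and averages the gradients. This is a generic illustration, not code from this repo: `buildModel()` is a placeholder for however the encoder/decoder is constructed, and the GPU ids are examples. Since seq2seq-attn unrolls its RNNs with cloned modules, wrapping it would likely need more work than shown here.

```lua
-- Rough sketch (not part of seq2seq-attn): wrapping a network in
-- nn.DataParallelTable so each GPU processes a slice of the batch.
require 'cunn'

local gpus = {1, 2, 3, 4}  -- example GPU ids, e.g. on an AWS multi-GPU instance

-- `buildModel()` is a hypothetical constructor standing in for the
-- actual model-building code in train.lua.
local function makeDataParallel(model, gpus)
   -- Split inputs (and join outputs/gradients) along dimension 1,
   -- assumed here to be the batch dimension.
   local dpt = nn.DataParallelTable(1)
   dpt:add(model:cuda(), gpus)  -- replicate the module on the listed GPUs
   return dpt:cuda()
end

-- Usage (hypothetical):
--   local encoder = makeDataParallel(buildModel(), gpus)
--   encoder:forward(batch) / encoder:backward(batch, gradOutput)
-- then behave like the original module, with each GPU handling
-- roughly batchSize / #gpus examples.
```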