harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

cudnn and combined char+word model #36

boknilev closed this issue 8 years ago

boknilev commented 8 years ago

Hi,

Thanks for building this nice tool!

  1. Do you have any plans to incorporate cudnn's LSTM implementation (e.g. from cudnn.torch) for speed-up?
  2. Is there an option to use a combined char+word model in the same way as in the original language modelling work?
yoonkim commented 8 years ago

Hi!

  1. We do not currently have plans to incorporate this (a rough sketch of what a cudnn.LSTM drop-in could look like follows this list).
  2. We do not currently support this, but I imagine it won't be too difficult to add: you would need to modify models.lua to resemble the original LSTMTDNN.lua code, adjust the inputs/gradients during training, and finally modify data.lua so that it returns both characters and words as inputs (see the second sketch below).
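
For item 1, a minimal sketch of what using cudnn's LSTM might look like, assuming the cudnn.torch package. The layer sizes, depth, and batchFirst layout are illustrative assumptions, not seq2seq-attn code; the speed-up comes from cudnn.LSTM running all timesteps in one fused kernel call instead of stepping through an nngraph cell per timestep:

```lua
-- Minimal sketch, assuming cudnn.torch is installed; sizes are illustrative.
require 'cudnn'

local input_size, hidden_size, num_layers = 500, 500, 2

-- batchFirst = true: input is shaped (batch, seqLen, inputSize)
local rnn = cudnn.LSTM(input_size, hidden_size, num_layers, true):cuda()

local x = torch.CudaTensor(32, 50, input_size):uniform()
local h = rnn:forward(x)  -- (32, 50, hidden_size): all timesteps in one call
```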
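
For item 2, a minimal sketch of the kind of module being described: a per-word character CNN in the spirit of LSTMTDNN.lua from the lstm-char-cnn project, whose max-pooled features are concatenated with the word embedding. It uses standard Torch nn modules; all sizes are illustrative assumptions, not values from seq2seq-attn:

```lua
-- Minimal sketch with standard nn modules; sizes are hypothetical.
-- The character branch processes one word (i.e. one encoder time step).
require 'nn'

local char_vocab, word_vocab = 100, 50000   -- hypothetical vocabulary sizes
local char_emb, word_emb     = 15, 500      -- hypothetical embedding sizes
local max_word_len           = 20           -- characters per word, padded
local n_filters, kw          = 200, 5       -- CNN feature maps, filter width

-- character branch: a word's characters -> one fixed-size feature vector
local char_cnn = nn.Sequential()
char_cnn:add(nn.LookupTable(char_vocab, char_emb))            -- (batch, max_word_len, char_emb)
char_cnn:add(nn.TemporalConvolution(char_emb, n_filters, kw)) -- (batch, max_word_len - kw + 1, n_filters)
char_cnn:add(nn.Tanh())
char_cnn:add(nn.Max(2))                                       -- max-over-time -> (batch, n_filters)

-- word branch: the usual word embedding lookup
local word_lookup = nn.LookupTable(word_vocab, word_emb)      -- (batch, word_emb)

-- combined view of the same word, to be fed to the encoder LSTM
local combined = nn.Sequential()
combined:add(nn.ParallelTable():add(char_cnn):add(word_lookup))
combined:add(nn.JoinTable(2))                                 -- (batch, n_filters + word_emb)

-- example forward pass on a batch of 32 words
local chars = torch.LongTensor(32, max_word_len):random(1, char_vocab)
local words = torch.LongTensor(32):random(1, word_vocab)
local out = combined:forward({chars, words})                  -- 32 x 700
```

In seq2seq-attn, a module like this would roughly take the place of the plain word LookupTable in models.lua, with data.lua extended to return the character tensor alongside the word indices, which is the change described above.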

Yoon