harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

cudnn and combined char+word model #36

boknilev closed this issue 8 years ago

boknilev commented 8 years ago

Hi,

Thanks for building this nice tool!

  1. Do you have any plans to incorporate cudnn's LSTM implementation (e.g. from cudnn.torch) for speed-up?
  2. Is there an option to use a combined char+word model in the same way as in the original language modelling work?
yoonkim commented 8 years ago

Hi!

  1. We do not currently have plans to incorporate this (a rough sketch of what a cudnn.LSTM drop-in could look like follows this list).
  2. We do not currently support this, but I imagine it won't be too difficult to add: you would need to modify models.lua to resemble the original LSTMTDNN.lua code, adjust the inputs/gradients during training, and finally modify data.lua so that it returns both characters and words as inputs (see the second sketch below).
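
For item 1, a minimal sketch of what using cudnn's LSTM might look like, assuming the cudnn.torch package. The layer sizes, depth, and batchFirst layout are illustrative assumptions, not seq2seq-attn code; the speed-up comes from cudnn.LSTM running all timesteps in one fused kernel call instead of stepping through an nngraph cell per timestep:

```lua
-- Minimal sketch, assuming cudnn.torch is installed; sizes are illustrative.
require 'cudnn'

local input_size, hidden_size, num_layers = 500, 500, 2

-- batchFirst = true: input is shaped (batch, seqLen, inputSize)
local rnn = cudnn.LSTM(input_size, hidden_size, num_layers, true):cuda()

local x = torch.CudaTensor(32, 50, input_size):uniform()
local h = rnn:forward(x)  -- (32, 50, hidden_size): all timesteps in one call
```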
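
For item 2, a minimal sketch of the kind of module being described: a per-word character CNN in the spirit of LSTMTDNN.lua from the lstm-char-cnn project, whose max-pooled features are concatenated with the word embedding. It uses standard Torch nn modules; all sizes are illustrative assumptions, not values from seq2seq-attn:

```lua
-- Minimal sketch with standard nn modules; sizes are hypothetical.
-- The character branch processes one word (i.e. one encoder time step).
require 'nn'

local char_vocab, word_vocab = 100, 50000   -- hypothetical vocabulary sizes
local char_emb, word_emb     = 15, 500      -- hypothetical embedding sizes
local max_word_len           = 20           -- characters per word, padded
local n_filters, kw          = 200, 5       -- CNN feature maps, filter width

-- character branch: a word's characters -> one fixed-size feature vector
local char_cnn = nn.Sequential()
char_cnn:add(nn.LookupTable(char_vocab, char_emb))            -- (batch, max_word_len, char_emb)
char_cnn:add(nn.TemporalConvolution(char_emb, n_filters, kw)) -- (batch, max_word_len - kw + 1, n_filters)
char_cnn:add(nn.Tanh())
char_cnn:add(nn.Max(2))                                       -- max-over-time -> (batch, n_filters)

-- word branch: the usual word embedding lookup
local word_lookup = nn.LookupTable(word_vocab, word_emb)      -- (batch, word_emb)

-- combined view of the same word, to be fed to the encoder LSTM
local combined = nn.Sequential()
combined:add(nn.ParallelTable():add(char_cnn):add(word_lookup))
combined:add(nn.JoinTable(2))                                 -- (batch, n_filters + word_emb)

-- example forward pass on a batch of 32 words
local chars = torch.LongTensor(32, max_word_len):random(1, char_vocab)
local words = torch.LongTensor(32):random(1, word_vocab)
local out = combined:forward({chars, words})                  -- 32 x 700
```

In seq2seq-attn, a module like this would roughly take the place of the plain word LookupTable in models.lua, with data.lua extended to return the character tensor alongside the word indices, which is the change described above.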

Yoon