ryonakamura / parlai_agents

# ParlAI Agent examples with PyTorch, Chainer and TensorFlow

seq2seq with PyTorch causes GPU out of memory #1

Open ghost opened 6 years ago

ghost commented 6 years ago

Hello, when I train the example seq2seq model on the Ubuntu dataset in ParlAI, the GPU runs out of memory after training on a few thousand examples. Do you know how to solve this problem? I saw similar issues posted before; I think it is related to PyTorch.

ryonakamura commented 6 years ago

Hello, @ZixuanLiang. I am sorry, but the example implementations currently guarantee operation only on the bAbI tasks. On other tasks, if the vocabulary is too large, the embedding and softmax matrices become huge, causing out-of-memory errors. It is necessary to reduce the vocabulary by converting low-frequency words to an unknown token. Unfortunately, ParlAI does not implement this feature. I plan to implement a dictionary agent (e.g. dict-minfreq, subword, SentencePiece) to solve this problem soon.

Also, in the case of seq2seq, if the input sentence is too long, the number of LSTM states that must be kept in memory grows, which also causes out of memory. You may be able to avoid this simply by reducing the hidden size. Thank you!
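For illustration, here is a minimal sketch of the min-frequency cutoff described above. This is not ParlAI's dictionary agent (which did not exist at the time); the function names and the `min_freq` parameter are hypothetical, chosen just to show how mapping rare words to an unknown token bounds the vocabulary, and with it the size of the embedding and softmax matrices.

```python
from collections import Counter

UNK = "<unk>"

def build_vocab(token_lists, min_freq=2):
    """Build a word-to-index map, dropping words rarer than min_freq.

    Capping the vocabulary this way bounds the rows of the embedding
    matrix and the columns of the softmax layer, which is what avoids
    the out-of-memory blowup on large corpora like Ubuntu.
    """
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {UNK: 0}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(tokens, vocab):
    """Map tokens to indices, sending rare/unseen words to the UNK index."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]

# Toy corpus: "there" and "hi" appear only once, so they collapse to <unk>.
corpus = [["hello", "world"], ["hello", "there"], ["hi", "world"]]
vocab = build_vocab(corpus, min_freq=2)
print(len(vocab))                        # → 3 (<unk>, hello, world)
print(encode(["hello", "there"], vocab))  # → [1, 0]
```

With a real dataset one would tune `min_freq` (or use a subword scheme such as SentencePiece) to trade coverage against memory.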