AndDoIt closed this issue 2 years ago
It happens because the tensors built at lines 166 and 218 of model.py are too large. For example, with 100 words, a batch size of 20, and 300 hidden-state neurons, the operation multiplies a tensor of shape (100, 100, 20, 900) by a (900, 300) matrix, so it has to hold that many elements in memory at once. Either reduce the number of hidden neurons or the batch size, or use a GPU with more memory.
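To get a rough feel for the numbers, here is a back-of-the-envelope sketch. It reads the shapes above as a (100, 100, 20, 900) tensor multiplied by a (900, 300) matrix, and it assumes 4-byte float32 elements; both the helper name `tensor_bytes` and the float32 assumption are mine, not taken from model.py. It also ignores gradients and any other activations, so real usage will be higher.

```python
from math import prod

def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed to store a dense tensor of the given shape.

    bytes_per_element=4 assumes float32 (an assumption, not from model.py).
    """
    return prod(shape) * bytes_per_element

# Sizes taken from the example in the comment above.
words, batch, features, hidden = 100, 20, 900, 300

input_shape = (words, words, batch, features)   # left operand
output_shape = (words, words, batch, hidden)    # result of multiplying by (900, 300)

input_bytes = tensor_bytes(input_shape)
output_bytes = tensor_bytes(output_shape)

# Halving the batch size halves the size of the left operand.
half_batch_bytes = tensor_bytes((words, words, batch // 2, features))

print(f"input:      {input_bytes / 1e6:.0f} MB")
print(f"output:     {output_bytes / 1e6:.0f} MB")
print(f"half batch: {half_batch_bytes / 1e6:.0f} MB")
```

Under these assumptions the left operand alone is about 720 MB, and it grows with the square of the number of words, so long inputs are the first thing to check.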
Thanks a lot, I solved it!
After I finished training the model and started testing, an OOM error occurred, since the model is not optimized for multi-GPU use. Did you run into this problem before?