ufal / neuralmonkey

An open-source tool for sequence learning in NLP built on TensorFlow.
BSD 3-Clause "New" or "Revised" License

Add output buffering for neuralmonkey-run #799

Open varisd opened 5 years ago

varisd commented 5 years ago

If I understand correctly, when translating a dataset with neuralmonkey-run (using a pretrained model), all the translations (or other outputs) are held in memory (see this loop: https://github.com/ufal/neuralmonkey/blob/master/neuralmonkey/learning_utils.py#L323), and they are written to the output only after all of them have been produced (https://github.com/ufal/neuralmonkey/blob/master/neuralmonkey/learning_utils.py#L383).

When processing/translating large files, memory use grows with the number of outputs, which might cause memory issues.
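
A minimal sketch of the buffered approach, for illustration only. It does not use Neural Monkey's actual code or API: `batches`, `run_streaming`, and the `translate_batch` callable are hypothetical names, and the default batch size is an arbitrary assumption. The idea is to write each batch of outputs as soon as it is produced, so that only one batch is held in memory at a time.

```python
import io
from typing import Callable, Iterable, Iterator, List, TextIO


def batches(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def run_streaming(
    inputs: Iterable[str],
    translate_batch: Callable[[List[str]], List[str]],  # hypothetical stand-in for the model
    out: TextIO,
    batch_size: int = 64,  # arbitrary default, not taken from Neural Monkey
) -> int:
    """Translate `inputs` batch by batch, writing each batch immediately.

    Returns the number of lines written. Only the current batch and its
    outputs are kept in memory.
    """
    written = 0
    for batch in batches(inputs, batch_size):
        for line in translate_batch(batch):
            out.write(line + "\n")
            written += 1
        # Flush after every batch so results reach the file incrementally.
        out.flush()
    return written


if __name__ == "__main__":
    # Toy "model" that upper-cases its input, standing in for a real decoder.
    buffer = io.StringIO()
    count = run_streaming(iter(["a", "b", "c"]), lambda b: [s.upper() for s in b], buffer, batch_size=2)
    print(count, repr(buffer.getvalue()))
```

Because `inputs` can be any iterable (for example an open file object), the input side does not need to be loaded in full either.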

jlibovicky commented 5 years ago

Totally agree.

jindrahelcl commented 5 years ago

+1