robertostling / hnmt

Helsinki Neural Machine Translation system
GNU General Public License v3.0

Avoid loading the whole training set to memory #7

Closed robertostling closed 7 years ago

robertostling commented 7 years ago

Currently the whole training data set is loaded into RAM, but this obviously does not scale. At some point we need to fix this if we want to train with huge corpora.

robertostling commented 7 years ago

I've written a module for this now and will integrate it into HNMT later. It will come in handy now that CSC is charging for RAM usage.
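(For context, a minimal sketch of the general approach such a module could take: keep only byte offsets of each line in memory and read sentences from disk on demand. This is purely illustrative and not the actual HNMT module; the class name, file path, and methods are hypothetical.)

```python
# Hypothetical sketch (not the actual HNMT code): index a line-based corpus
# by byte offsets so sentences can be fetched lazily instead of keeping the
# whole training set in RAM.

import random


class IndexedCorpus:
    """Random-access reader over a line-based corpus file.

    Only the byte offset of each line is held in memory; the text itself
    is read from disk when a line is requested.
    """

    def __init__(self, path, encoding='utf-8'):
        self.path = path
        self.encoding = encoding
        self.offsets = []
        # One pass over the file to record where each line starts.
        with open(path, 'rb') as f:
            offset = 0
            for line in f:
                self.offsets.append(offset)
                offset += len(line)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, i):
        # Seek to the stored offset and read back a single line.
        with open(self.path, 'rb') as f:
            f.seek(self.offsets[i])
            return f.readline().decode(self.encoding).rstrip('\n')

    def sample_batch(self, batch_size):
        # Draw a random minibatch of sentences without loading the corpus.
        idx = random.sample(range(len(self)), batch_size)
        return [self[i] for i in idx]


if __name__ == '__main__':
    corpus = IndexedCorpus('train.en')   # example path, not from the repo
    print(len(corpus), 'sentences indexed')
    print(corpus.sample_batch(4))
```

The memory cost then grows with the number of sentences (one integer offset each) rather than with the total corpus size, at the price of extra disk seeks when assembling minibatches.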

robertostling commented 7 years ago

Implemented in 6ebca3ba