yvesscherrer opened 1 year ago
It seems most probable that the process was killed for exceeding memory limits. Eflomal uses a considerable amount of memory for large inputs, apparently growing linearly with corpus size: for a corpus of 20 million sentence pairs, it used 10 gigabytes of memory.
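Assuming the linear scaling observed above holds, the per-pair cost can be extrapolated to other corpus sizes. The snippet below is only a back-of-the-envelope estimate derived from the single data point reported here (10 GB for 20M pairs), not a measured profile:

```python
# Rough estimate of Eflomal's peak memory use, assuming linear scaling
# and the single observed data point: ~10 GiB for 20 million pairs.
BYTES_PER_PAIR = 10 * 1024**3 / 20_000_000  # ~537 bytes per sentence pair

def estimated_memory_gib(n_pairs: int) -> float:
    """Estimate peak memory in GiB for a corpus of n_pairs sentence pairs."""
    return n_pairs * BYTES_PER_PAIR / 1024**3

print(f"{estimated_memory_gib(50_000_000):.1f} GiB")  # -> 25.0 GiB
```

By this estimate, a 50M-pair corpus would need around 25 GiB, which easily exceeds common container or batch-job memory limits.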
Possible solutions:
- The `score` step and `filter` with `filterfalse=True` automatically do chunking, but the normal `filter` does not. Maybe there should be an option for that.
Alignment model creation works fine, but during filtering Eflomal crashes with the following error message:
The Eflomal unit test (`test_eflomal.py`) runs fine:
The OpusFilter unit test also seems to run fine: