qe-team / marmot

MARMOT - the open source framework for feature extraction and machine learning, designed to estimate the quality of Machine Translation output
ISC License

alignment model creates temporary files which don't get deleted #3

Open chrishokamp opened 9 years ago

chrishokamp commented 9 years ago

these files get created when we run the alignment feature extractor:

They should either be deleted, or they should be put into a configurable directory with a logical name

varvara-l commented 9 years ago

Can it happen that we need the alignment model more than once in one run? If not, I'll change the code so that these files are deleted.

chrishokamp commented 9 years ago

We should provide an option to persist the files that we might need again (probably enabled by default) in a user-specified directory (by default ~/marmot-data). We first check whether the files are there; if they are not, we build them. The file names should be the original name plus some suffix, i.e. europarl.en.align ...
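A minimal sketch of the check-then-build behaviour described above, not the actual MARMOT code. The function names, the `persist` flag, and the `build_fn(corpus_path, out_path)` callable are all hypothetical; only the default directory `~/marmot-data` and the `.align` suffix come from this comment. It assumes Python 3.

```python
import os
import tempfile


def cached_model_path(corpus_path, cache_dir="~/marmot-data", suffix=".align"):
    """Return the path where the model for corpus_path would be cached."""
    cache_dir = os.path.expanduser(cache_dir)
    # Original file name + suffix, e.g. europarl.en -> europarl.en.align
    return os.path.join(cache_dir, os.path.basename(corpus_path) + suffix)


def get_or_build(corpus_path, build_fn, cache_dir="~/marmot-data", persist=True):
    """Reuse a cached model file if present, otherwise build it.

    build_fn(corpus_path, out_path) is a hypothetical callable that
    trains the aligner and writes the model to out_path.
    """
    if not persist:
        # Non-persistent mode: build into a temp file; the caller is
        # responsible for deleting it when done.
        fd, out_path = tempfile.mkstemp(suffix=".align")
        os.close(fd)
        build_fn(corpus_path, out_path)
        return out_path

    out_path = cached_model_path(corpus_path, cache_dir)
    if not os.path.exists(out_path):
        # Cache miss: create the directory if needed, then build once.
        os.makedirs(os.path.dirname(out_path), exist_ok=True)
        build_fn(corpus_path, out_path)
    return out_path
```

With this shape, a second call for the same corpus returns the existing file without invoking `build_fn` again.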

chrishokamp commented 9 years ago

Also, the tests should delete any alignment files they create. We should leave some small sample alignment files in the test_data/ directory for the cases where they are needed.

The real issue is here, in the tests: the aligner just dumps the align_model* files into whatever directory it is called from.
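One way the test cleanup could look, as a rough sketch: each test gets its own temporary directory, and `tearDown` removes it along with anything the aligner wrote there. `fake_train_aligner` and the `.params` / `.err` extensions are hypothetical stand-ins, since the real aligner call and its output file names are not shown in this thread.

```python
import os
import shutil
import tempfile
import unittest


def fake_train_aligner(out_dir):
    """Hypothetical stand-in for the aligner: writes align_model.* files."""
    for ext in ("params", "err"):
        with open(os.path.join(out_dir, "align_model." + ext), "w") as f:
            f.write("dummy")


class AlignmentFilesTest(unittest.TestCase):
    def setUp(self):
        # Fresh scratch directory per test, instead of the current directory.
        self.tmp_dir = tempfile.mkdtemp()

    def tearDown(self):
        # Remove everything the test created, even if the test failed.
        shutil.rmtree(self.tmp_dir, ignore_errors=True)

    def test_model_files_are_written_to_tmp_dir(self):
        fake_train_aligner(self.tmp_dir)
        created = sorted(os.listdir(self.tmp_dir))
        self.assertEqual(created, ["align_model.err", "align_model.params"])
```

This only works if the aligner accepts an output directory; if it always writes to the current working directory, the test would need to change into the temporary directory first.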