Open Gldkslfmsd opened 6 years ago
As for a Python fork of MT-ComparEval: yes, it would be a lot of work, but it could take advantage of sacreBLEU or NumPy implementations of BLEU. That said, the current implementation is based on storing all n-grams in a database, which has speed benefits when comparing several systems (which is MT-ComparEval's use case), and I am not sure how to combine that with a vectorized NumPy implementation.
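To make the trade-off concrete, here is a minimal sketch of the n-gram counting that BLEU relies on. This is not MT-ComparEval's actual code; it only illustrates why precomputing and storing n-gram counts pays off: the reference counts are computed once and reused across every system being compared.

```python
from collections import Counter

def ngram_counts(sentence, max_n=4):
    """Count all n-grams up to max_n in a whitespace-tokenized sentence."""
    tokens = tuple(sentence.split())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tokens[i:i + n]] += 1
    return counts

# Reference counts are computed once (in MT-ComparEval, stored in the DB) ...
ref_counts = ngram_counts("the cat sat on the mat")

# ... and reused for each system's hypothesis.
for hyp in ["the cat sat", "a cat sat on a mat"]:
    hyp_counts = ngram_counts(hyp)
    # Clipped n-gram matches, the core quantity of BLEU.
    matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
```

A vectorized NumPy BLEU typically recomputes counts per corpus pass, which is fast for one system but discards exactly this reuse across systems.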
There are a few simple tricks to speed up the import:
Hello, I think the task importing is too slow to use often and conveniently: loading 1 task with 4 translations, each of 3003 sentences, takes 31 minutes on my laptop. (Btw., on other machines I tried, the import time was similar and the task didn't show up in the web interface. But I can't confirm the dependencies are installed correctly there.)
Any plans/ideas/comments on how to contribute to make it faster? One idea is to rewrite the whole watcher in Python and use easy multiprocessing and fast libraries, but that would require too much work.
Another option could be to make it parallel. Since I'm not a PHP programmer, all I can do is write a bash script launching

php -f www/index.php Background:Tasks:Import --folder=./data/experiment

processes in parallel. Can anyone share such scripts or experiences?
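For what it's worth, the same parallel launch can be sketched in Python instead of bash, using only the standard library. This is a hedged sketch, not a tested recipe: the `php` command line is the one from the post, while the folder list, the worker count, and the assumption that separate import processes don't conflict in the database are all mine.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def import_folders(folders, workers=4,
                   cmd_prefix=("php", "-f", "www/index.php",
                               "Background:Tasks:Import")):
    """Launch one import process per folder, at most `workers` at a time.

    Returns the list of process return codes, in folder order.
    """
    def run(folder):
        # Threads are fine here: each one just waits on an external process.
        return subprocess.run(list(cmd_prefix) + [f"--folder={folder}"],
                              check=False).returncode
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, folders))
```

Usage would be something like `import_folders(["./data/experiment1", "./data/experiment2"])`, assuming each folder is a separate experiment to import.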