Faster importing - Githubissues

Hello, I think task importing is too slow to use often and conveniently: loading 1 task with 4 translations of 3003 sentences each takes 31 minutes on my laptop. (By the way, on the other machines I tried, the import time was similar and the result didn't show up on the web. But I can't confirm that the dependencies are installed correctly there.)

Any plans, ideas, or comments on how to contribute to make it faster? One idea is to rewrite the whole watcher in Python, using simple multiprocessing and fast libraries, but that would require too much work.

Another option would be to make it parallel. Since I'm not a PHP programmer, all I can do is write a bash script that launches php -f www/index.php Background:Tasks:Import --folder=./data/experiment processes in parallel. Can anyone share such scripts or experience with this?
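
To make the parallel idea concrete, here is a minimal sketch of such a launcher in Python rather than bash. It only wraps the command quoted above and changes the --folder value. The names build_command and import_all are made up for this illustration. Two things are assumptions I have not verified: that several import processes can safely write to the same database at once, and which folder level (experiment or task) --folder expects, so the folders are simply passed on the command line.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(folder):
    # Command copied from the post above; only the --folder value varies.
    return ["php", "-f", "www/index.php", "Background:Tasks:Import",
            f"--folder={folder}"]


def import_all(folders, workers=4, run=subprocess.run):
    """Start one import process per folder, at most `workers` at a time.

    Assumption: concurrent imports do not corrupt or lock the shared
    database. This has not been checked against MT-ComparEval itself.
    """
    def run_one(folder):
        result = run(build_command(folder))
        return str(folder), result.returncode

    # Threads are enough here: each one just waits for its php process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, folders))


if __name__ == "__main__":
    # Folders to import are given as command-line arguments.
    for folder, code in import_all(sys.argv[1:]).items():
        print(folder, "ok" if code == 0 else f"failed (exit code {code})")
```

It would be run from the MT-ComparEval root directory with the folders to import as arguments. The watcher should probably be stopped first so that it does not pick up the same folders, but that is also a guess.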

d@e:~/tmp/MT-ComparEval$ bash bin/watcher.sh 
Watcher is watching folder: ./data
[14-Mar-2018 11:42:24]  New experiment called de-cs_multitask was found
[14-Mar-2018 11:42:24]  source.txt used as a source source.
[14-Mar-2018 11:42:24]  de-cs_multitask has 3003 source sentences
[14-Mar-2018 11:42:24]  reference.txt used as a reference source.
[14-Mar-2018 11:42:24]  de-cs_multitask has 3003 reference sentences
[14-Mar-2018 11:42:27]  Experiment de-cs_multitask uploaded successfully.
[14-Mar-2018 11:42:27]  Importing task: de-cs_multitask:ner
[14-Mar-2018 11:42:27]  translation.txt used as a translation source.
[14-Mar-2018 11:42:27]  ner has 3003 translation sentences
[14-Mar-2018 11:42:44]  Generating BLEU samples for ner.
[14-Mar-2018 11:44:34]  Samples generated.
[14-Mar-2018 11:44:37]  Generating BLEU-cased samples for ner.
[14-Mar-2018 11:46:36]  Samples generated.
[14-Mar-2018 11:48:32]  Precomputing n-grams for ner.
[14-Mar-2018 11:48:32]  N-grams precomputation done.
[14-Mar-2018 11:48:32]  Task ner uploaded successfully
[14-Mar-2018 11:48:33]  Importing task: de-cs_multitask:pos_small
[14-Mar-2018 11:48:33]  translation.txt used as a translation source.
[14-Mar-2018 11:48:33]  pos_small has 3003 translation sentences
[14-Mar-2018 11:48:49]  Generating BLEU samples for pos_small.
[14-Mar-2018 11:50:54]  Samples generated.
[14-Mar-2018 11:50:57]  Generating BLEU-cased samples for pos_small.
[14-Mar-2018 11:53:15]  Samples generated.
[14-Mar-2018 11:55:07]  Precomputing n-grams for pos_small.
[14-Mar-2018 11:55:50]  N-grams precomputation done.
[14-Mar-2018 11:55:50]  Task pos_small uploaded successfully
[14-Mar-2018 11:55:52]  Importing task: de-cs_multitask:pos_full
[14-Mar-2018 11:55:53]  translation.txt used as a translation source.
[14-Mar-2018 11:55:53]  pos_full has 3003 translation sentences
[14-Mar-2018 11:56:11]  Generating BLEU samples for pos_full.
[14-Mar-2018 11:58:23]  Samples generated.
[14-Mar-2018 11:58:26]  Generating BLEU-cased samples for pos_full.
[14-Mar-2018 12:00:44]  Samples generated.
[14-Mar-2018 12:02:49]  Precomputing n-grams for pos_full.
[14-Mar-2018 12:04:15]  N-grams precomputation done.
[14-Mar-2018 12:04:15]  Task pos_full uploaded successfully
[14-Mar-2018 12:04:20]  Importing task: de-cs_multitask:baseline
[14-Mar-2018 12:04:20]  translation.txt used as a translation source.
[14-Mar-2018 12:04:20]  baseline has 3003 translation sentences
[14-Mar-2018 12:04:38]  Generating BLEU samples for baseline.
[14-Mar-2018 12:06:55]  Samples generated.
[14-Mar-2018 12:06:59]  Generating BLEU-cased samples for baseline.
[14-Mar-2018 12:09:25]  Samples generated.
[14-Mar-2018 12:11:28]  Precomputing n-grams for baseline.
[14-Mar-2018 12:13:34]  N-grams precomputation done.
[14-Mar-2018 12:13:34]  Task baseline uploaded successfully
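
To see where the time goes, the timestamps in the log above can be turned into per-step durations. This is a small sketch for that; parse and step_durations are names invented here, and the sample lines are copied from the ner task in the log. It assumes an English locale so that month names like "Mar" parse.

```python
import re
from datetime import datetime

# Matches lines such as "[14-Mar-2018 11:42:44]  Generating BLEU samples for ner."
LINE = re.compile(r"^\[(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2})\]\s+(.*)$")

SAMPLE = """\
[14-Mar-2018 11:42:44]  Generating BLEU samples for ner.
[14-Mar-2018 11:44:34]  Samples generated.
[14-Mar-2018 11:44:37]  Generating BLEU-cased samples for ner.
[14-Mar-2018 11:46:36]  Samples generated.
[14-Mar-2018 11:48:32]  Precomputing n-grams for ner.
[14-Mar-2018 11:48:32]  N-grams precomputation done.
"""


def parse(log_text):
    """Return (timestamp, message) pairs for lines that carry a timestamp."""
    entries = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            # %b relies on the locale's month abbreviations (English assumed).
            stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
            entries.append((stamp, match.group(2)))
    return entries


def step_durations(entries):
    """Seconds elapsed between each message and the next one."""
    return [
        (message, int((following - stamp).total_seconds()))
        for (stamp, message), (following, _) in zip(entries, entries[1:])
    ]


for message, seconds in step_durations(parse(SAMPLE)):
    print(f"{seconds:4d} s  {message}")
```

For the ner task this gives 110 s and 119 s for the two BLEU sample generations, plus another 116 s between the second "Samples generated." and "Precomputing n-grams" that no log message accounts for. The other three tasks show the same pattern, so those steps look like the first candidates for speeding up.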

ondrejklejch / MT-ComparEval

Faster importing #80