ondrejklejch / MT-ComparEval

Tool for comparison and evaluation of machine translation.
Apache License 2.0

generating TER samples is slow #78

Closed Gldkslfmsd closed 6 years ago

Gldkslfmsd commented 6 years ago

Hello, I'm running the watcher again on the data mentioned in #77 with a fresh installation. Generating the TER samples is really slow: it takes more than 1 hour for 3000 sentences. Can you speed it up or disable it by default?

Watcher is watching folder: ./data
[17-Jan-2018 12:26:59]  New experiment called de-cs_BPE_boundary_mark was found
[17-Jan-2018 12:26:59]  source.txt used as a source source.
[17-Jan-2018 12:26:59]  de-cs_BPE_boundary_mark has 3000 source sentences
[17-Jan-2018 12:26:59]  reference.txt used as a reference source.
[17-Jan-2018 12:26:59]  de-cs_BPE_boundary_mark has 3000 reference sentences
[17-Jan-2018 12:27:02]  Experiment de-cs_BPE_boundary_mark uploaded successfully.
[17-Jan-2018 12:27:02]  New experiment called de-cs_BPE_vs_STE was found
[17-Jan-2018 12:27:03]  source.txt used as a source source.
[17-Jan-2018 12:27:03]  de-cs_BPE_vs_STE has 3000 source sentences
[17-Jan-2018 12:27:03]  reference.txt used as a reference source.
[17-Jan-2018 12:27:03]  de-cs_BPE_vs_STE has 3000 reference sentences
[17-Jan-2018 12:27:06]  Experiment de-cs_BPE_vs_STE uploaded successfully.
[17-Jan-2018 12:27:06]  New experiment called de-cs_morph_segm was found
[17-Jan-2018 12:27:06]  source.txt used as a source source.
[17-Jan-2018 12:27:06]  de-cs_morph_segm has 3000 source sentences
[17-Jan-2018 12:27:06]  reference.txt used as a reference source.
[17-Jan-2018 12:27:06]  de-cs_morph_segm has 3000 reference sentences
[17-Jan-2018 12:27:10]  Experiment de-cs_morph_segm uploaded successfully.
[17-Jan-2018 12:27:10]  New experiment called en-cs was found
[17-Jan-2018 12:27:10]  source.txt used as a source source.
[17-Jan-2018 12:27:10]  en-cs has 3000 source sentences
[17-Jan-2018 12:27:10]  reference.txt used as a reference source.
[17-Jan-2018 12:27:10]  en-cs has 3000 reference sentences
[17-Jan-2018 12:27:14]  Experiment en-cs uploaded successfully.
[17-Jan-2018 12:27:14]  Importing task: de-cs_BPE_boundary_mark:AZ
[17-Jan-2018 12:27:14]  translation.txt used as a translation source.
[17-Jan-2018 12:27:14]  AZ has 3000 translation sentences
[17-Jan-2018 12:27:26]  Generating BLEU samples for AZ.
[17-Jan-2018 12:29:54]  Samples generated.
[17-Jan-2018 12:29:54]  Generating TER samples for AZ.
(killed at 13:32)

martinpopel commented 6 years ago

What about just disabling bootstrap for TER in the default setup? https://github.com/choko/MT-ComparEval/blob/39027307ef8131ac48b2d1ac6585ba42d4ac403f/app/config/config.neon#L152

A proper solution would be to cache per-sentence TER statistics and resample just these statistics, thus not executing the TER binaries 1000 times, but I am not sure if @lefterav has time to implement this.
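
To illustrate the caching idea, here is a minimal Python sketch. It is not MT-ComparEval's code (the project is written in PHP), and every name in it (`SentenceStats`, `corpus_ter`, `bootstrap_ter`) is hypothetical. It assumes a single reference per sentence and that the TER tool has already been run once to obtain, for each sentence, the number of edits and the reference length in words. Each bootstrap sample then only re-sums those cached numbers instead of calling the TER binaries again.

```python
import random
from typing import List, Sequence, Tuple

# Cached statistics for one sentence: (number of edits, reference length in words).
# This layout is an assumption made for the sketch, not the tool's actual format.
SentenceStats = Tuple[float, int]


def corpus_ter(stats: Sequence[SentenceStats]) -> float:
    """Corpus-level TER: total edits divided by total reference words."""
    total_edits = sum(edits for edits, _ in stats)
    total_ref_words = sum(ref_len for _, ref_len in stats)
    return total_edits / total_ref_words if total_ref_words else 0.0


def bootstrap_ter(
    stats: Sequence[SentenceStats], n_samples: int = 1000, seed: int = 0
) -> List[float]:
    """Resample the cached per-sentence statistics with replacement.

    Returns one corpus-level TER score per bootstrap sample. The TER
    binaries are never invoked here; only the cached numbers are reused.
    """
    rng = random.Random(seed)
    n = len(stats)
    samples = []
    for _ in range(n_samples):
        # Draw n sentences with replacement from the cached statistics.
        resampled = [stats[rng.randrange(n)] for _ in range(n)]
        samples.append(corpus_ter(resampled))
    return samples


if __name__ == "__main__":
    # Toy statistics for three sentences (made-up numbers).
    cached = [(2, 10), (0, 8), (5, 12)]
    print(corpus_ter(cached))
    print(len(bootstrap_ter(cached, n_samples=1000)))
```

With this approach the expensive edit-distance computation runs once per sentence, and the 1000 resampling rounds reduce to summing cached integers.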

ondrejklejch commented 6 years ago

Bootstrap resampling turned off for TER in 5d48281396e534545574c82521826b4709b15409.

Gldkslfmsd commented 6 years ago

How do I apply the changes in the config? I commented out these lines:

TER: [
    class: @terMetric,
    case_sensitive: False,
    compute_bootstrap: True,
],

Then I restarted the watcher, and it's generating TER samples again. The same happens after a server restart. Do I have to use a fresh installation again?

ondrejklejch commented 6 years ago

Check out the most recent version and remove the cache (rm -rf ./temp/*).

Gldkslfmsd commented 6 years ago

Thanks, @choko!

lefterav commented 6 years ago

Good point