moses-smt / mosesdecoder

Moses, the machine translation system
http://www.statmt.org/moses
GNU Lesser General Public License v2.1
1.58k stars 778 forks source link

`--multi-moses` not working with IRSTLM? #141

Closed ypeels closed 8 years ago

ypeels commented 8 years ago

I am trying to use the --multi-moses flag during tuning to work around the fact that IRSTLM does not support multi-threaded decoding, but I run into the following error message:

mert-moses.pl $corpus1 $corpus2 $mosesdir/bin/moses $traindir/model/moses.ini \
    --multi-moses --decoder-flags '--threads 8'
ERROR: Failed to run 'moses --threads 8 -config filtered/moses.ini -show-weights'

Running the command from the error message from the mert-work directory gives:

Exception: Error: 8 number of threads specified but IRST LM is not threadsafe.

It seems that ___DECODER_FLAGS is being used differently in two places in mert-moses.pl:

I can work around this by removing __DECODER_FLAGS from the latter, but I can't imagine this will work for all use cases...

Thank you!

ugermann commented 8 years ago

Is there any particular reason that you must use IRSTLM? Apparently newer versions of IRSTLM are thread-safe (check their github repo at https://github.com/irstlm-team/irstlm); but personally I usually stick with KenLM, which is thread-safe.

On Sun, Jan 31, 2016 at 12:42 AM, Jonathan Chen notifications@github.com wrote:

I am trying to use the --multi-moses flag during tuning to work around the fact that IRSTLM does not support multi-threaded decoding, but I run into the following error message:

mert-mosespl $corpus1 $corpus2 $mosesdir/bin/moses $traindir/model/mosesini \ --multi-moses --decoder-flags '--threads 8' ERROR: Failed to run 'moses --threads 8 -config filtered/mosesini -show-weights'

Running the command from the error message from the mert-work directory gives:

Exception: Error: 8 number of threads specified but IRST LM is not threadsafe

It seems that ___DECODER_FLAGS is being used differently in two places in mert-mosespl https://githubcom/moses-smt/mosesdecoder/blob/master/scripts/training/mert-mosespl :

I can work around this by removing __DECODER_FLAGS from the latter, but I can't imagine this will work for all use cases

Thank you!

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/141.

Ulrich Germann Senior Researcher School of Informatics University of Edinburgh

ypeels commented 8 years ago

Hi Uli,

Thank you for the very fast reply! I am trying to reproduce some baseline literature results which reported using IRSTLM. (KenLM seems to result in slightly different BLEU and TER scores).

Thanks for pointing out that IRSTLM is now thread-safe. But wouldn't moses still throw an exception after seeing IRSTLM in moses.ini?

Best, Jonathan

ugermann commented 8 years ago

It probably would. You'd have to nag the IRSTLM team into maintaining their contribution to Moses, or do (and test) it yourself. Unfortunately, we currently don't have any funding for anyone to curate the entire Moses code basis, so developers and contributors tend to focus their attention on things that are relevant to them.

If experimental replication is your primary goal, you should stick to the version of Moses (and OS, perl, python, gcc, boost, etc.) that the original work used anyway (changes in the code might not just affect the LM, but also other aspects of performance). If the differences between use of KenML vs IRSTLM are statistically significant, that would be a finding worth publishing, but I doubt they are. If they are not, why should we care about the differences?

On Sun, Jan 31, 2016 at 1:07 AM, Jonathan Chen notifications@github.com wrote:

Hi Uli,

Thank you for the very fast reply! I am trying to reproduce some baseline literature results which reported using IRSTLM. (KenLM seems to result in slightly different BLEU and TER scores).

Thanks for pointing out that IRSTLM is now thread-safe. But wouldn't moses still throw an exception after seeing IRSTLM in moses.ini?

Best, Jonathan

— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/141#issuecomment-177352581 .

Ulrich Germann Senior Researcher School of Informatics University of Edinburgh

nicolabertoldi commented 8 years ago

Some time ago, I modified Moses to correctly handle the thread-safe version of IRSTLM, without throwing any exception. Probably, in some next merge/revert action this part of code was removed. I will fix it asap.

best Nicola

hieuhoang commented 8 years ago

closing this thread unless you guys are still working on it

ypeels commented 8 years ago

The issue persists for me in the latest build b760ad8a7edbe253a9b366f088d51919a1eb67a5. I do not think did the fix for this got checked in; should this thread be reopened?

In the history for moses/LM/IRST.cpp, it looks like changes in commits 165092885c2f86a7b167b091ea8ccec2a1b159e1 and a1a1b6f2da3f258576a06e053916e1f9276f798f by @nicolabertoldi were modified and then overwritten in commits 857ee21d5065b9b5f3a41ee12611fd1670ce376d and 54d0ab36f4a8467e6b5fdcd8cb8db4e21667f7eb by @ugermann, all in October 2015.

Thank you!

hieuhoang commented 8 years ago

ok. I'll reopen it. Hopefully, someone will scratch this itch

mjdenkowski commented 8 years ago

I just checked in a pull request with what should be a fix for this. Can someone who uses IRSTLM please test it?

ypeels commented 8 years ago

Pull request #147 works for me (commit c8749aa69a8916afafe5c2523103d8c0c5ec97ed); thank you!

ypeels commented 8 years ago

Moved the underlying issue of the "thread-safe version of IRSTLM" mentioned by @nicolabertoldi to its own issue where it belongs: #148