Closed ypeels closed 8 years ago
Is there any particular reason that you must use IRSTLM? Apparently newer versions of IRSTLM are thread-safe (check their github repo at https://github.com/irstlm-team/irstlm); but personally I usually stick with KenLM, which is thread-safe.
On Sun, Jan 31, 2016 at 12:42 AM, Jonathan Chen notifications@github.com wrote:
I am trying to use the --multi-moses flag during tuning to work around the fact that IRSTLM does not support multi-threaded decoding, but I run into the following error message:
mert-mosespl $corpus1 $corpus2 $mosesdir/bin/moses $traindir/model/mosesini \ --multi-moses --decoder-flags '--threads 8' ERROR: Failed to run 'moses --threads 8 -config filtered/mosesini -show-weights'
Running the command from the error message from the mert-work directory gives:
Exception: Error: 8 number of threads specified but IRST LM is not threadsafe
It seems that ___DECODER_FLAGS is being used differently in two places in mert-mosespl https://githubcom/moses-smt/mosesdecoder/blob/master/scripts/training/mert-mosespl :
- to pass the number of instances to multi_mosespy https://githubcom/moses-smt/mosesdecoder/blob/master/scripts/generic/multi_mosespy in run_decoder()
- to pass the number of threads to a standard (auxiliary?) call to the decoder in get_featlist_from_moses()
I can work around this by removing __DECODER_FLAGS from the latter, but I can't imagine this will work for all use cases
Thank you!
— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/141.
Ulrich Germann Senior Researcher School of Informatics University of Edinburgh
Hi Uli,
Thank you for the very fast reply! I am trying to reproduce some baseline literature results which reported using IRSTLM. (KenLM seems to result in slightly different BLEU and TER scores).
Thanks for pointing out that IRSTLM is now thread-safe. But wouldn't moses
still throw an exception after seeing IRSTLM
in moses.ini
?
Best, Jonathan
It probably would. You'd have to nag the IRSTLM team into maintaining their contribution to Moses, or do (and test) it yourself. Unfortunately, we currently don't have any funding for anyone to curate the entire Moses code basis, so developers and contributors tend to focus their attention on things that are relevant to them.
If experimental replication is your primary goal, you should stick to the version of Moses (and OS, perl, python, gcc, boost, etc.) that the original work used anyway (changes in the code might not just affect the LM, but also other aspects of performance). If the differences between use of KenML vs IRSTLM are statistically significant, that would be a finding worth publishing, but I doubt they are. If they are not, why should we care about the differences?
On Sun, Jan 31, 2016 at 1:07 AM, Jonathan Chen notifications@github.com wrote:
Hi Uli,
Thank you for the very fast reply! I am trying to reproduce some baseline literature results which reported using IRSTLM. (KenLM seems to result in slightly different BLEU and TER scores).
Thanks for pointing out that IRSTLM is now thread-safe. But wouldn't moses still throw an exception after seeing IRSTLM in moses.ini?
Best, Jonathan
— Reply to this email directly or view it on GitHub https://github.com/moses-smt/mosesdecoder/issues/141#issuecomment-177352581 .
Ulrich Germann Senior Researcher School of Informatics University of Edinburgh
Some time ago, I modified Moses to correctly handle the thread-safe version of IRSTLM, without throwing any exception. Probably, in some next merge/revert action this part of code was removed. I will fix it asap.
best Nicola
closing this thread unless you guys are still working on it
The issue persists for me in the latest build b760ad8a7edbe253a9b366f088d51919a1eb67a5. I do not think did the fix for this got checked in; should this thread be reopened?
In the history for moses/LM/IRST.cpp, it looks like changes in commits 165092885c2f86a7b167b091ea8ccec2a1b159e1 and a1a1b6f2da3f258576a06e053916e1f9276f798f by @nicolabertoldi were modified and then overwritten in commits 857ee21d5065b9b5f3a41ee12611fd1670ce376d and 54d0ab36f4a8467e6b5fdcd8cb8db4e21667f7eb by @ugermann, all in October 2015.
Thank you!
ok. I'll reopen it. Hopefully, someone will scratch this itch
I just checked in a pull request with what should be a fix for this. Can someone who uses IRSTLM please test it?
Pull request #147 works for me (commit c8749aa69a8916afafe5c2523103d8c0c5ec97ed); thank you!
Moved the underlying issue of the "thread-safe version of IRSTLM" mentioned by @nicolabertoldi to its own issue where it belongs: #148
I am trying to use the
--multi-moses
flag during tuning to work around the fact that IRSTLM does not support multi-threaded decoding, but I run into the following error message:Running the command from the error message from the
mert-work
directory gives:It seems that
___DECODER_FLAGS
is being used differently in two places in mert-moses.pl:run_decoder()
get_featlist_from_moses()
I can work around this by removing
__DECODER_FLAGS
from the latter, but I can't imagine this will work for all use cases...Thank you!