Closed raju249 closed 7 years ago
Hi,
You can just run the translation with a trained model and use the benchmark/3rdParty/multi-bleu.perl script to compute the BLEU score.
Is there a link to documentation showing how to use the mentioned script?
It is the standard BLEU script so:
benchmark/3rdParty/multi-bleu.perl gold.txt < pred.txt
perl benchmark/3rdParty/multi-bleu.perl refer.txt < pred.txt
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "hi.UTF-8",
LC_ALL = (unset),
LC_CTYPE = "UTF-8",
LANG = "hi_IN.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("hi_IN.UTF-8").
Use of uninitialized value in division (/) at benchmark/3rdParty/multi-bleu.perl line 129, <STDIN> line 1.
Use of uninitialized value in division (/) at benchmark/3rdParty/multi-bleu.perl line 129, <STDIN> line 1.
Use of uninitialized value in division (/) at benchmark/3rdParty/multi-bleu.perl line 129, <STDIN> line 1.
Use of uninitialized value in division (/) at benchmark/3rdParty/multi-bleu.perl line 129, <STDIN> line 1.
Use of uninitialized value $CORRECT[1] in multiplication (*) at benchmark/3rdParty/multi-bleu.perl line 134, <STDIN> line 1.
Use of uninitialized value $CORRECT[2] in multiplication (*) at benchmark/3rdParty/multi-bleu.perl line 134, <STDIN> line 1.
Use of uninitialized value $CORRECT[3] in multiplication (*) at benchmark/3rdParty/multi-bleu.perl line 134, <STDIN> line 1.
Use of uninitialized value $CORRECT[4] in multiplication (*) at benchmark/3rdParty/multi-bleu.perl line 134, <STDIN> line 1.
BLEU = 0.00, 0.0/0.0/0.0/0.0 (BP=1.000, ration=1.500)
@guillaumekln This is what I get when I run it. Is this the expected output?
I think gold.txt is the reference file, which is assumed to contain the correct translations, and pred.txt is assumed to contain the text translated by the model.
Am I correct @guillaumekln ?
Correct me if wrong.
Thanks
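Before computing BLEU, it can help to confirm that the reference and prediction files are line-aligned and share at least some tokens. Below is a minimal sketch of such a check, assuming one whitespace-tokenized sentence per line; the function name check_alignment is hypothetical and is not part of OpenNMT or the BLEU script.

```python
from collections import Counter


def check_alignment(ref_lines, hyp_lines):
    """Return (same_length, unigram_overlap) for two lists of sentences.

    Assumes one sentence per list entry, tokenized on whitespace.
    """
    same_length = len(ref_lines) == len(hyp_lines)
    overlap = 0
    for ref, hyp in zip(ref_lines, hyp_lines):
        ref_counts = Counter(ref.split())
        hyp_counts = Counter(hyp.split())
        # Counter intersection keeps the minimum count of each shared token.
        overlap += sum((ref_counts & hyp_counts).values())
    return same_length, overlap
```

For files, the lines could be read with open("gold.txt", encoding="utf-8").read().splitlines() and likewise for pred.txt. An overlap of zero would be consistent with the BLEU = 0.00 output above and may point to swapped files, an encoding mismatch, or different tokenization.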
Maybe you'll need to use the original script: https://raw.githubusercontent.com/moses-smt/mosesdecoder/master/scripts/generic/multi-bleu.perl
@jsenellart-systran Could you check the custom BLEU script?
Ok,
Anyway, for now I am using the nltk implementation of the BLEU score.
Hello, normally the implementation is right. The "Use of uninitialized value in division" errors mean that there is no matching 1-gram, 2-gram, 3-gram, or 4-gram, which is weird. Can you share your output and reference files?
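To illustrate why missing n-gram matches lead to a score of zero, here is a rough sketch of corpus-level BLEU in Python. It is not the Perl script itself: it assumes a single reference per sentence and plain whitespace tokenization, and the function names are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU (0-100), single reference, whitespace tokens."""
    correct = [0] * max_n  # clipped n-gram matches per order
    total = [0] * max_n    # hypothesis n-grams per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            total[n - 1] += sum(hyp_counts.values())
            # Clip each hypothesis n-gram count by its reference count.
            correct[n - 1] += sum((hyp_counts & ref_counts).values())
    # If any order has no match, the geometric mean is zero.
    if hyp_len == 0 or min(correct) == 0:
        return 0.0
    # Brevity penalty only applies when the hypothesis is shorter.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    log_prec = sum(math.log(c / t) for c, t in zip(correct, total)) / max_n
    return 100 * bp * math.exp(log_prec)
```

With this sketch, a prediction that shares no tokens with its reference scores 0.0 regardless of length, which matches the symptom in the log above.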
Hi team, apart from the perplexity score, is there any other way to measure model performance within OpenNMT? For example, is there a way to calculate the BLEU score of a model built with OpenNMT?