samsontmr closed this issue 2 years ago.
There's no --force for fairseq-train. We should also document how to silence that warning (although, to be honest, it ought to silence itself after the first alert anyway, IMHO...).
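Until the warning silences itself, one generic shell trick (nothing fairseq-specific; emit_warnings below is a made-up stand-in for any noisy command) is to capture stderr and replay only the distinct warning lines:

```shell
#!/usr/bin/env bash
# Stand-in for any command (e.g. a training run) that repeats the same
# warning on stderr while writing real output to stdout.
emit_warnings() {
  echo "result line"
  for _ in 1 2 3; do
    echo "WARNING: input looks already tokenized" >&2
  done
}

# Capture stderr to a file, then replay each distinct warning only once;
# stdout passes through untouched.
emit_warnings 2>all_warnings.log
awk '!seen[$0]++' all_warnings.log >&2
```

This prints the warning once instead of three times; the awk one-liner keeps the first occurrence of every distinct stderr line.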
I am also trying to reproduce the score of the transformer.wmt14.en-fr model, but I am not able to. Here is the script I use.
#!/usr/bin/env bash
# download the data
data=wmt14_en_fr
mkdir -p $data
sacrebleu -t wmt14 -l en-fr --echo src > $data/test.raw.en
sacrebleu -t wmt14 -l en-fr --echo ref > $data/test.raw.fr
split=test
model=wmt14.en-fr.joined-dict.transformer
src=en
tgt=fr
maxtokens=16000
set -e
# clone Moses scripts if not already present
[ -d mosesdecoder ] || git clone https://github.com/moses-smt/mosesdecoder.git
SCRIPTS=mosesdecoder/scripts
TOKENIZER=$SCRIPTS/tokenizer/tokenizer.perl
CLEAN=$SCRIPTS/training/clean-corpus-n.perl
NORM_PUNC=$SCRIPTS/tokenizer/normalize-punctuation.perl
REM_NON_PRINT_CHAR=$SCRIPTS/tokenizer/remove-non-printing-char.perl
# normalise punctuation and tokenize the data
cat $data/$split.raw.$src | $NORM_PUNC $src | $REM_NON_PRINT_CHAR | $TOKENIZER -threads 8 -a -q -l $src > $data/$split.$src.tok
# apply BPE
subword-nmt apply-bpe -c $model/bpecodes < $data/$split.$src.tok > $data/$split.$src
# convert into binary form
fairseq-preprocess --${split}pref $data/$split --destdir data-bin/$data --srcdict $model/dict.$src.txt --tgtdict $model/dict.$tgt.txt --workers 8 -s $src -t $tgt --only-source
# copy the target dictionary
cp $model/dict.$tgt.txt data-bin/$data/
echo "generating hypothesis"
fairseq-generate data-bin/$data/ --path $model/model.pt --skip-invalid-size-inputs-valid-test \
--max-tokens $maxtokens --remove-bpe --gen-subset $split --beam 4 --lenpen 0.6 | tee $data/$split.$src.out
grep ^H $data/$split.$src.out | cut -c3- | sort -nk1 | cut -f3 | ./mosesdecoder/scripts/tokenizer/detokenizer.perl -q > $data/$split.$src.hyp
cat $data/$split.$src.hyp | sacrebleu -t wmt14 -l $src-$tgt
# BLEU+case.mixed+lang.en-fr+numrefs.1+smooth.exp+test.wmt14+tok.13a+version.1.5.1 = 35.6 62.7/41.7/29.4/20.9 (BP = 1.000 ratio = 1.047 hyp_len = 80924 ref_len = 77306)
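For what it's worth, the grep/cut/sort extraction step in the script assumes fairseq-generate prints hypotheses as tab-separated H-&lt;id&gt; lines. A tiny self-contained check of that pipeline, using fabricated output lines (not real model output):

```shell
#!/usr/bin/env bash
# Fabricated fairseq-generate-style output, deliberately out of order,
# to sanity-check the H-line extraction pipeline.
printf 'H-1\t-0.5\tsecond sentence\nS-0\tsource line\nH-0\t-0.3\tfirst sentence\n' > fake.out

# Same pipeline as above: keep H lines, drop the "H-" prefix,
# sort numerically by sentence id, keep the hypothesis text (3rd tab field).
grep ^H fake.out | cut -c3- | sort -nk1 | cut -f3 > fake.hyp
cat fake.hyp
```

fake.hyp should end up with the hypotheses restored to source order (first sentence, then second), which is what sacrebleu needs to line up hypotheses with references.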
As you can see, I get a BLEU score of 35.6, similar to what @samsontmr reported, but the paper reports 41.4. @myleott, could you point out what I am doing wrong and how I can get closer to the reported score? Thanks!
cc @edunov @michaelauli
@samsontmr Were you able to reproduce the results?
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
Hi! I tried running generate to evaluate transformer.wmt14.en-fr on the WMT'14 test set but was only able to get a BLEU score of 35.42. I ran prepare-wmt14en2fr.sh and fairseq-preprocess on the data beforehand as well. Could you share the command for evaluating the Transformer EN-FR WMT'14 model? Here is what I'm using:
I tried a beam of 5 as well but it didn't give much better results.
I also got this message even though the file is tokenized:
Thanks!
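Regarding that message: sacrebleu warns when the hypothesis file looks pre-tokenized rather than detokenized. A rough stand-in for that kind of heuristic (my own pattern and sample data, not sacrebleu's exact check) to inspect a hypothesis file before scoring:

```shell
#!/usr/bin/env bash
# Moses-tokenized text typically has a space before the final period;
# detokenized text does not. Count suspicious lines in a sample file.
printf 'This looks tokenized .\nThis is detokenized.\n' > sample.hyp

suspect=$(grep -c ' \.$' sample.hyp)
total=$(wc -l < sample.hyp)
echo "$suspect of $total lines look tokenized"
```

If many lines match, the file probably still needs to go through detokenizer.perl before being piped into sacrebleu.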