Closed by goodbai-nlp 3 years ago
Hi,
You need to use the JAMR tokenizer (https://github.com/redpony/cdec/blob/master/corpus/tokenize-anything.sh) to tokenize both the outputs and the gold sentences, and then score them with the scorer we provide (https://github.com/SapienzaNLP/spring/blob/main/bin/eval_bleu.py). Sorry that it is not more straightforward, but this was done to ensure comparability with previous approaches.
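For anyone wondering why the tokenization step changes the score so much, here is a rough, self-contained sketch. It does not call the real tools: `simple_tokenize` is a hypothetical stand-in for the tokenizer script linked above, and `corpus_bleu` is a simplified single-reference BLEU without smoothing, not the scorer in `eval_bleu.py`. The example sentences are made up. It only illustrates that the same prediction gets a different BLEU depending on how both sides are tokenized.

```python
import math
import re
from collections import Counter


def simple_tokenize(text):
    # Hypothetical stand-in for the real tokenizer script:
    # separates punctuation marks from words.
    return re.findall(r"\w+|[^\w\s]", text)


def ngrams(tokens, n):
    # Multiset of all n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hyps, refs, max_n=4):
    # Simplified corpus-level BLEU (one reference per sentence,
    # no smoothing), reported on a 0-100 scale.
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            matches[n - 1] += sum((h & r).values())  # clipped n-gram matches
            totals[n - 1] += sum(h.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)


# Made-up example pair; the prediction only misses a comma.
gold = ["The boy wants to go to the park, but it is raining."]
pred = ["The boy wants to go to the park but it is raining."]

# Whitespace splitting keeps "park," as one token, so "park" never matches.
bleu_whitespace = corpus_bleu([p.split() for p in pred], [g.split() for g in gold])
# Tokenizing both sides the same way exposes the shared word "park".
bleu_tokenized = corpus_bleu([simple_tokenize(p) for p in pred],
                             [simple_tokenize(g) for g in gold])

print(f"whitespace split: {bleu_whitespace:.2f}")
print(f"tokenized:        {bleu_tokenized:.2f}")
```

The two printed scores differ even though the prediction is identical, which is why outputs and gold sentences have to go through the same tokenizer before scoring if the numbers are to be comparable with the paper.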
Thanks, I get a BLEU of 45.1 now.
Thanks for your nice work! I ran into a few problems when trying to reproduce the AMR2Text results on AMR2.0.
but got a BLEU score of 41.78, which is lower than the result (45.3) reported in your paper.
but only got a BLEU score of 42.3.
I have no idea what is going wrong. Could anyone give me some suggestions?
My virtual environment is available here.