Evaluate on BLEU with BPE tokens while training?

asyml / texar

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

https://asyml.io

Apache License 2.0


Evaluate on BLEU with BPE tokens while training? #262

Closed by ehsan-soe 4 years ago

ehsan-soe commented 4 years ago

Hi, I wonder if it is possible to evaluate the model on a development set during training (to save the best model) based on the BLEU score computed over BPE token ids? Or do we have to detokenize and decode back to natural text before calculating BLEU during training?

Thanks

gpengzhi commented 4 years ago

Hi, I think you should decode back to the text before calculating the BLEU score.
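
For context on the suggestion above: BLEU computed directly on BPE tokens counts subword n-grams, so the number is not comparable to word-level BLEU reported elsewhere. A minimal sketch of the decode-then-score step follows. It assumes subword-nmt style `@@` continuation markers, and the names `id_to_token`, `ids_to_text`, and `corpus_bleu` are hypothetical helpers written for illustration, not Texar's API. The BLEU here is a plain unsmoothed single-reference version; for reported results a standard scorer would be the safer choice.

```python
import math
from collections import Counter


def ids_to_text(ids, id_to_token, eos_id=None):
    """Map token ids to BPE tokens, then merge '@@ ' continuation markers.

    Assumes subword-nmt style BPE, where 'trans@@ lation' means 'translation'.
    """
    tokens = []
    for i in ids:
        if eos_id is not None and i == eos_id:
            break  # ignore everything after end-of-sequence
        tokens.append(id_to_token[i])
    text = " ".join(tokens).replace("@@ ", "")
    # A truncated hypothesis may end in a dangling marker.
    if text.endswith("@@"):
        text = text[:-2]
    return text


def ngram_counts(words, n):
    """Count all n-grams of order n in a list of words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Unsmoothed corpus-level BLEU (0-100), one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.split(), hyp.split()
        ref_len += len(r)
        hyp_len += len(h)
        for n in range(1, max_n + 1):
            h_counts = ngram_counts(h, n)
            r_counts = ngram_counts(r, n)
            # Clipped matches: each n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy vocabulary (hypothetical) standing in for the model's BPE vocab.
id_to_token = {0: "<eos>", 1: "the", 2: "trans@@", 3: "lation", 4: "is", 5: "good"}
hypothesis = ids_to_text([1, 2, 3, 4, 5, 0], id_to_token, eos_id=0)
print(hypothesis, corpus_bleu(["the translation is good"], [hypothesis]))
```

In a training loop, the same two steps would run on the dev-set predictions at each evaluation interval, and the checkpoint would be saved whenever the score improves.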