Closed FYYFU closed 1 year ago
Hi @FYYFU , Thank you for your interest in our work! I see that you closed the issue, but I will answer in case it still helps.
As far as I understand from the Huggingface code, `predict_with_generate` is crucial when generating outputs for machine translation.
Otherwise, the model just predicts a single token at a time, and uses a kind of "teacher forcing" at test time. This is useful only to measure perplexity, but not for evaluating long outputs using metrics such as BLEU.
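To make the distinction concrete, here is a minimal toy sketch of teacher-forced prediction versus free-running generation. It is not the Huggingface implementation: `toy_next_token`, the `BOS`/`EOS` ids, and the doubling rule are all hypothetical stand-ins for a real decoder step.

```python
from typing import List

BOS = 1  # hypothetical start-of-sequence id
EOS = 0  # hypothetical end-of-sequence id


def toy_next_token(prefix: List[int]) -> int:
    """Toy stand-in for one decoder step: doubles the last token modulo 5."""
    return (prefix[-1] * 2) % 5


def teacher_forced_predictions(gold: List[int]) -> List[int]:
    """Predict each position from the gold prefix, as a loss/perplexity pass does."""
    preds = []
    for t in range(len(gold)):
        prefix = [BOS] + gold[:t]  # the reference tokens are fed in, not the model's own
        preds.append(toy_next_token(prefix))
    return preds


def generate(max_new_tokens: int) -> List[int]:
    """Free-running decoding: each step is conditioned on the model's own outputs."""
    prefix = [BOS]
    for _ in range(max_new_tokens):
        token = toy_next_token(prefix)
        prefix.append(token)
        if token == EOS:
            break
    return prefix[1:]


gold = [2, 3, 0]
print(teacher_forced_predictions(gold))  # [2, 4, 1]
print(generate(max_new_tokens=5))        # [2, 4, 3, 1, 2]
```

The two outputs diverge at the third position: the teacher-forced pass conditions on the gold token 3, while generation conditions on the model's own earlier output 4. Only the generated sequence is a complete hypothesis that a metric such as BLEU can score.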
Let me know if anything is still unclear! Uri
Thanks for your reply! I'm sorry, this was my mistake when setting up the eval shell script. Since I did not set `overwrite_cache` in the KNN-MT eval shell script, it used the cache instead of regenerating the dataset. As a result, the keys saved in the datastore came from the validation dataset, which led to a higher BLEU and a lower speed on the validation dataset.
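A toy sketch of how a stale cache can produce this kind of mix-up. This is only an illustration under assumptions: `load_preprocessed`, the `_CACHE` dictionary, and the key name are hypothetical and do not reflect how the Huggingface datasets cache is actually keyed.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory cache standing in for preprocessed dataset files on disk.
_CACHE: Dict[str, List[str]] = {}


def load_preprocessed(
    cache_key: str,
    build: Callable[[], List[str]],
    overwrite_cache: bool = False,
) -> List[str]:
    """Return cached examples unless overwrite_cache forces a rebuild."""
    if overwrite_cache or cache_key not in _CACHE:
        _CACHE[cache_key] = build()
    return _CACHE[cache_key]


# An earlier run preprocessed the validation split under this key.
load_preprocessed("datastore_inputs", lambda: ["val-0", "val-1"])

# A later run intends to use the training split but silently gets the cached data.
stale = load_preprocessed("datastore_inputs", lambda: ["train-0", "train-1"])
print(stale)  # ['val-0', 'val-1']

# Forcing a rebuild returns the intended split.
fresh = load_preprocessed(
    "datastore_inputs", lambda: ["train-0", "train-1"], overwrite_cache=True
)
print(fresh)  # ['train-0', 'train-1']
```

If datastore keys are built from the stale examples, the validation sentences end up in the datastore themselves, which would explain an inflated BLEU on that same set.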
Again, thanks for your awesome work! :)
Hi. Thanks for your awesome project! For t5-small, I got the following MT result on the validation set:
However, for KNN-MT, I got a different result:
The speed is also so slow that I wonder whether something is wrong in my shell script. The KNN-MT shell script is:
The original MT shell script is:
I notice that if I delete `predict_with_generate` from the KNN-MT shell script, the speed becomes the same as the original MT, and the `eval_loss` is also the same as the original MT. But then I cannot get the `eval_bleu`. Like:
However, setting `predict_with_generate` does not affect the speed of the original MT. Could you please give some instructions for solving this problem? Thanks!