mit-han-lab / hardware-aware-transformers

[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
https://hat.mit.edu

lower loss but the BLEU is 0 #14

Open leo038 opened 2 years ago

leo038 commented 2 years ago

I have trained a model, and the loss on both the train and valid sets is very low (below 2). But when I evaluated BLEU, it was 0. I checked the translation output, and every hypothesis is the same degenerate sequence, like "the the the the ...". This is very strange; what could be the reason?
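For context on why degenerate output like "the the the ..." scores exactly 0: BLEU takes a geometric mean of modified n-gram precisions, so a single zero precision (here, no matching bigram) zeroes the whole score regardless of the loss. The following is a minimal self-contained sketch of unsmoothed sentence-level BLEU (not HAT's or fairseq's actual scorer, which uses corpus-level counts and may apply smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Unsmoothed BLEU for a single sentence pair (illustrative only)."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        ref_ngrams = ngrams(ref, n)
        # modified precision: clip hypothesis counts by reference counts
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(matches / total)
    # any zero precision drives the geometric mean (and BLEU) to zero
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(log_mean)

ref = "the cat sat on the mat"
print(bleu("the the the the the the", ref))  # repeated-token output -> 0.0
print(bleu("the cat sat on the mat", ref))   # exact match -> 1.0
```

Since a repeated-token hypothesis has no matching bigrams at all, BLEU is 0 even though unigram precision is nonzero, so a low cross-entropy loss and a zero BLEU are not contradictory if decoding has collapsed.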