LiuAlex1109 opened this issue 3 months ago (Open)
It generally doesn't happen, but it is most probably because TextStreamer is being called twice. You can use .split() to extract just the answer; for reference you can use this link. If the problem still bothers you, would you mind sharing the notebook link or a screenshot of the snippet?
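A minimal sketch of the .split() approach mentioned above (not from the original thread): if the decoded text repeats the prompt plus trailing content, split on your prompt template's answer marker and cut at the EOS token. The "### Response:" marker and the EOS string here are assumptions; adjust both to match your own template and tokenizer.

```python
def extract_answer(decoded_output: str,
                   marker: str = "### Response:",
                   eos_token: str = "<|end_of_text|>") -> str:
    # Keep only the text after the last answer marker,
    # then cut at the first EOS token and trim whitespace.
    answer = decoded_output.split(marker)[-1]
    answer = answer.split(eos_token)[0]
    return answer.strip()

# Example usage with a made-up generation string:
raw = "### Instruction:\nTranslate hello.\n### Response:\nBonjour<|end_of_text|>extra text"
print(extract_answer(raw))  # -> "Bonjour"
```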
It's possible the EOS token isn't being generated - I suggest adding 5 EOS tokens or so to the finetuning dataset.
Also, since you're working in another language, I'm assuming you're doing continued pretraining on Llama-3? It's possible the EOS token is being suppressed - so yes, try appending 5 EOS tokens.
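A minimal sketch of the EOS suggestion above, assuming an Unsloth/TRL-style setup where each training example is a single "text" field: append the EOS token (repeated a few times, 5 here) to every example so the model learns to emit it and stop. The field name, file path, EOS string, and repeat count are all assumptions - use `tokenizer.eos_token` and your own dataset in practice.

```python
from datasets import load_dataset

EOS = "<|end_of_text|>"  # assumption; use tokenizer.eos_token for your model

def append_eos(example):
    # Repeat the EOS token at the end of each training example.
    example["text"] = example["text"] + EOS * 5
    return example

dataset = load_dataset("json", data_files="train.json", split="train")  # hypothetical file
dataset = dataset.map(append_eos)
```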
I trained a model with Unsloth, and when I input some questions to evaluate it, the model always outputs extra content, as shown below:
My code:
Could someone tell me how to solve this? Thanks.