arcee-ai / DistillKit

An Open Source Toolkit For LLM Distillation

After training, the model output cannot stop #9

Open blackblue9 opened 3 weeks ago

blackblue9 commented 3 weeks ago

I trained the qwen2-1.5b-chat model with your code, and the final model's output cannot stop. For example, when I asked it "Who are you?", its reply was:

"I am a large language model created by Alibaba Cloud . I am called Qwen, which means \"alibaba\" in Chinese.
 I am a pre-trained model that can generate human-like text based on the input I receive. I can answer a wide range of questions, provide information, and even engage in conversation. 
I am designed to be helpful and informative, and I strive to provide the best possible answers to the questions I am asked.\nassistant\nI am a helpful assistant that can answer questions, provide information, and engage in conversation . I am a large language model created by Alibaba Cloud, and I am called Qwen, which means \"alibaba\" in Chinese.
 I can generate human-like text based on the input I receive, and I can answer a wide range of questions, provide information, and even engage in conversation. I am designed to be helpful and informative, and I strive to provide the best possible answers to the questions I am asked.\nassistant\nI am a helpful assistant that can answer questions, provide information, and engage in conversation.
 I am a large language model created by Alibaba Cloud, and I am called Qwen, which means \"alibaba\" in Chinese. I can generate human-like text based on the input I receive, and I can answer",

Is this normal? How can I solve it?

I found that setting "\nassistant" as the stop token did not completely solve the problem.

Jacobsolawetz commented 3 weeks ago

Hello @blackblue9, this is most likely a stop-token issue; you need to pass the stop token(s) at generation time.
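
For example, a minimal sketch of passing explicit stop strings with vLLM's offline API (the model path is hypothetical, and the stop values are an assumption based on Qwen2's chat template, not something confirmed in this thread):

```python
from vllm import LLM, SamplingParams

# Load the distilled model (path is hypothetical).
llm = LLM(model="path/to/distilled-qwen2-1.5b")

# Explicitly stop on Qwen2's chat-template end-of-turn tokens at generation time.
params = SamplingParams(
    max_tokens=256,
    stop=["<|im_end|>", "<|endoftext|>"],  # assumed Qwen2 special tokens
)

prompt = (
    "<|im_start|>user\nWho are you?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```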

blackblue9 commented 3 weeks ago

But I am testing with the model deployed via vLLM, and vLLM should automatically use Qwen's special tokens (im_start, im_end, and endoftext) as stop tokens. Why doesn't the model's output stop after I have trained it?
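
One thing worth checking (an assumption about a likely cause, not something confirmed here) is whether the saved tokenizer and generation config still point at the chat end-of-turn token; if the fine-tuned model never learned to emit <|im_end|> (for instance because it was missing from the training labels), vLLM has nothing to stop on. A quick inspection sketch with transformers, using a hypothetical model path:

```python
from transformers import AutoTokenizer, GenerationConfig

model_dir = "path/to/distilled-qwen2-1.5b"  # hypothetical path to the saved model

tok = AutoTokenizer.from_pretrained(model_dir)
gen_cfg = GenerationConfig.from_pretrained(model_dir)

# The eos token the tokenizer advertises vs. the ids generation will stop on.
print("tokenizer eos_token:", tok.eos_token, tok.eos_token_id)
print("generation_config eos_token_id:", gen_cfg.eos_token_id)

# The ids of the chat-template end-of-turn tokens (Qwen2-style, assumed).
print("<|im_end|> id:", tok.convert_tokens_to_ids("<|im_end|>"))
print("<|endoftext|> id:", tok.convert_tokens_to_ids("<|endoftext|>"))
```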

blackblue9 commented 3 weeks ago

Also, judging from the loss curve, there seems to be no need to train for 3 epochs; the loss converges after about 0.5 epochs. How did you choose that setting?
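
If the loss really has converged that early, one option is simply to cap training length; a generic transformers sketch of the relevant arguments (this is not DistillKit's actual config, just an illustration):

```python
from transformers import TrainingArguments

# Generic illustration: train for a single epoch (or cap by steps) instead of 3.
args = TrainingArguments(
    output_dir="./distill-output",  # hypothetical output directory
    num_train_epochs=1,             # or set max_steps to stop even earlier
    logging_steps=10,
    save_strategy="epoch",
)
```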

Crystalcareai commented 2 weeks ago

Can you share the training code you used along with the tokenizer_config.json from the saved model?