QwenLM / Qwen2.5-Coder

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
2.94k stars 192 forks source link

the model continuously outputs repeated tokens #163

Closed Yhw109 closed 3 days ago

Yhw109 commented 1 week ago

Hello,

I am using the Qwen2.5_coder_3B model, and I have noticed that if I do not use the prompt format specified in Qwen2.5-Coder/evaluation/text_to_sql, the model continuously outputs repeated tokens until it reaches the max_tokens limit. For example, when using the prompt from Qwen2.5-Coder/examples/Qwen2.5-Coder.py to "#write a quick sort algorithm."

Do I need to use special tokens to organize my prompt in order to resolve the issue of the model producing repeated outputs?

Thank you!

huybery commented 1 week ago

Please pull the latest version of the model and provide a full prompt for us to reproduce if you still have problems.

Yhw109 commented 4 days ago

I am using the Qwen2.5-coder-base to train my own text-to-SQL model. Before proceeding, I would like to evaluate the performance of Qwen2.5-coder-base on the Spider and Bird datasets (the technical report only provided results for the instruct model).

However, when directly using the instructions for the instruct model from the Qwen2.5-Coder/qwencoder-eval/instruct/bird-spider folder, the base model seems to continuously output repeated tokens. I think I might have used the wrong prompt format. image

How can I obtain the performance for the Qwen2.5-coder-base model on these two datasets? And I noticed that you uploaded a fine-tuning folder. Can I use this folder directly to fine-tune the base model? Are there any specific requirements for the prompt format for text-to-SQL task? Thank you!!!