QwenLM / Qwen2.5-Coder

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by the Qwen team, Alibaba Cloud.

tokenizer.json changed after ms-swift sft #176

Status: Closed (oasis-0927 closed this issue 1 day ago)

oasis-0927 commented 4 days ago

First of all, thanks for this great work! I'm using ms-swift (2.6.0.post1) and transformers (4.46.3) to LoRA fine-tune the qwen-2.5-coder-32B model.
The SFT and export steps completed without errors, but when I try to load the LoRA-merged model, the tokenizer raises a "Data did not match any variant of untagged enum" error. I looked into the merged tokenizer.json file and found that the format of the "merges" section had changed, as follows:

Before SFT: (image)

After SFT: (image)

Here are some discussions that may help:
https://github.com/huggingface/transformers/issues/30324
https://github.com/unslothai/unsloth/issues/1059
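
For readers hitting the same error, here is a minimal Python sketch of one possible workaround. It rests on an assumption the screenshots above cannot confirm: that the exported tokenizer.json stores each BPE merge as a two-element list (for example ["a", "b"]), while the tokenizers version used for loading expects the older "a b" string form. The function names `merges_to_legacy` and `convert_file` are hypothetical helpers, not part of ms-swift or transformers.

```python
import json
from pathlib import Path


def merges_to_legacy(tokenizer_config: dict) -> dict:
    """Return a copy of a tokenizer.json dict with merges as "a b" strings.

    Assumption: list-style entries are [left, right] pairs. Entries that are
    already strings are passed through unchanged.
    """
    model = dict(tokenizer_config.get("model", {}))
    merges = model.get("merges", [])
    model["merges"] = [
        entry if isinstance(entry, str) else " ".join(entry)
        for entry in merges
    ]
    # Shallow copy of the top level; the input dict is not modified.
    return {**tokenizer_config, "model": model}


def convert_file(src: str, dst: str) -> None:
    """Read tokenizer.json from src and write the converted copy to dst."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    converted = merges_to_legacy(config)
    Path(dst).write_text(
        json.dumps(converted, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

Usage would look like `convert_file("merged/tokenizer.json", "merged/tokenizer.legacy.json")`, with both paths as placeholders. Joining with a space is only safe when no token contains a literal space, so it is worth keeping a backup of the original file. If the assumption about the format holds, upgrading the tokenizers package in the loading environment may be a simpler alternative to rewriting the file.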

cyente commented 2 days ago

Hi, you may need to file this issue with ms-swift.

oasis-0927 commented 1 day ago

> Hi, you may need to file this issue with ms-swift.

Got it.