SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

https://arxiv.org/pdf/2406.16858

Apache License 2.0

Training of Qwen2 #125

Open jzzzf opened 2 months ago

jzzzf commented 2 months ago

Hi, EAGLE team. I want to train an EAGLE-Qwen2 model on the Chinese-ShareGPT dataset. To generate the training data, I modified the code in ge_data_all_llama3.py. The only changes were sep = "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" to sep = "<|im_end|>\n<|im_start|>assistant\n" and sep2="<|eot_id|><|start_header_id|>user<|end_header_id|>" to sep2="<|im_end|>\n<|im_start|>user". However, when I train the model, the gradients explode. Is there anything I need to take care of when generating training data for Qwen2?

Liyuhui-12 commented 2 months ago

You can refer to eagle/ge_data/ge_data_all_qwen2.py.
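
For readers following the thread, here is a minimal sketch of how the two ChatML separators quoted in the question can delimit assistant replies in a formatted conversation. It is an illustration only and does not reproduce the contents of ge_data_all_qwen2.py. The function name assistant_spans, the constant names, and the sample conversation are hypothetical, and the sketch works on character offsets rather than token offsets to stay self-contained.

```python
from typing import List, Tuple

# Separators taken from the question above (Qwen2 / ChatML style).
SEP_ASSISTANT = "<|im_end|>\n<|im_start|>assistant\n"
SEP_USER = "<|im_end|>\n<|im_start|>user"
IM_END = "<|im_end|>"


def assistant_spans(conversation: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of each assistant reply.

    Hypothetical helper: split on the user separator to get turns,
    then split each turn on the assistant separator to find the reply.
    """
    spans = []
    offset = 0  # index in `conversation` where the current turn starts
    for turn in conversation.split(SEP_USER):
        parts = turn.split(SEP_ASSISTANT)
        if len(parts) == 2:
            prompt, reply = parts
            # The final turn may still carry a trailing end marker.
            reply = reply.partition(IM_END)[0]
            start = offset + len(prompt) + len(SEP_ASSISTANT)
            spans.append((start, start + len(reply)))
        offset += len(turn) + len(SEP_USER)
    return spans


# Made-up sample conversation in ChatML layout.
example = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHi<|im_end|>\n"
    "<|im_start|>assistant\nHello!<|im_end|>\n"
    "<|im_start|>user\nBye<|im_end|>\n"
    "<|im_start|>assistant\nGoodbye!<|im_end|>\n"
)

for start, end in assistant_spans(example):
    print(repr(example[start:end]))
```

With a real tokenizer the same bookkeeping has to be done in token offsets, where template details (for example, whether the tokenizer prepends a BOS token) can shift every span. That point is an assumption worth checking against the referenced script and is not something this thread confirms.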