Hi, EAGLE team. I want to train an EAGLE-Qwen2 model on the Chinese-ShareGPT dataset. To generate the training data, I modified ge_data_all_llama3.py. The only changes I made were sep = "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" to sep = "<|im_end|>\n<|im_start|>assistant\n" and sep2="<|eot_id|><|start_header_id|>user<|end_header_id|>" to sep2="<|im_end|>\n<|im_start|>user". However, when I train the model, I encounter exploding gradients. Is there anything else I need to take care of when generating training data for Qwen2?
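To make the separator change concrete, here is a minimal sketch of how the two Qwen2 separators would be expected to split a ChatML-formatted conversation into user and assistant parts. The helper `split_turns` and the example conversation are hypothetical and are not taken from the EAGLE repository; the sketch only checks string-level splitting and says nothing about tokenizer-specific offsets in the loss mask.

```python
# Hypothetical sanity check, not part of the EAGLE repository.
# Assumption: conversations are rendered in Qwen2's ChatML format.
SEP = "<|im_end|>\n<|im_start|>assistant\n"  # boundary: user turn -> assistant turn
SEP2 = "<|im_end|>\n<|im_start|>user"         # boundary: previous turn -> user turn


def split_turns(conversation):
    """Return a list of (prompt_part, assistant_part) string pairs.

    Raises ValueError if any turn does not contain exactly one SEP,
    which would indicate the separators do not match the chat template.
    """
    turns = conversation.split(SEP2)
    # The system prompt is also followed by SEP2, so glue it back
    # onto the first user turn before splitting on SEP.
    if len(turns) > 1 and SEP not in turns[0]:
        turns = [turns[0] + SEP2 + turns[1]] + turns[2:]

    pairs = []
    for turn in turns:
        parts = turn.split(SEP)
        if len(parts) != 2:
            raise ValueError(f"turn does not split into 2 parts: {turn!r}")
        pairs.append((parts[0], parts[1]))
    return pairs


# Made-up two-round conversation for illustration only.
example = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello<|im_end|>\n"
    "<|im_start|>assistant\nHi there!<|im_end|>\n"
    "<|im_start|>user\nBye<|im_end|>\n"
    "<|im_start|>assistant\nGoodbye!<|im_end|>\n"
)
pairs = split_turns(example)
for prompt_part, assistant_part in pairs:
    print(repr(assistant_part))
```

If a check like this raises on real samples, the separators do not line up with the rendered template, and any loss mask derived from the split would likely be misaligned as well.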