Closed Tendo33 closed 3 weeks ago
We also have some multi-turn data available, e.g., https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-MT-300K-v0.1. Ideally, multi-turn datasets can help LLMs perform better. However, we found from our empirical analysis that the performance increase using multi-turn datasets is marginal. In other words, single-turn alignment data can already make LLMs good at multi-turn dialogues.
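Since single-turn alignment data can already cover multi-turn behavior, one common trick is to flatten multi-turn conversations into single-turn pairs. The sketch below assumes a ShareGPT-style schema (a list of `{"from", "value"}` turns); this is an illustrative assumption, not necessarily the exact schema of Magpie-Pro-MT-300K-v0.1:

```python
# Hedged sketch: flatten a multi-turn conversation into (instruction, response)
# pairs. The {"from", "value"} turn format is an assumed ShareGPT-style schema,
# not necessarily the exact layout of Magpie-Pro-MT-300K-v0.1.

def flatten_to_single_turn(conversation):
    """Split alternating human/gpt turns into single-turn training pairs."""
    pairs = []
    for i in range(0, len(conversation) - 1, 2):
        user, assistant = conversation[i], conversation[i + 1]
        if user["from"] == "human" and assistant["from"] == "gpt":
            pairs.append((user["value"], assistant["value"]))
    return pairs

example = [
    {"from": "human", "value": "What is an LLM?"},
    {"from": "gpt", "value": "A large language model."},
    {"from": "human", "value": "Name one."},
    {"from": "gpt", "value": "GPT-4."},
]
print(flatten_to_single_turn(example))
# → [('What is an LLM?', 'A large language model.'), ('Name one.', 'GPT-4.')]
```

Note this drops cross-turn context; whether that matters is exactly the empirical question above, and our analysis suggests the gain from keeping full multi-turn context is marginal.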
Thanks for the reply. BTW, I noticed in the script that the Qwen2 series doesn't seem to have prompts specifically designed for tasks like "translation", "code", and "math". Is this because those prompts haven't been tested yet? Will they be added later on?
Indeed, we haven't tested "translation", "code", or "math" on the Qwen family. I think you can easily modify the config to support these tasks~
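Extending the config for these tasks could look something like the sketch below. The task names and prompt texts here are illustrative assumptions, not Magpie's actual config format or prompt wording:

```python
# Hedged sketch: a task -> system-prompt mapping for task-specific generation.
# The prompt texts and the fallback are assumptions for illustration only,
# not Magpie's actual configuration.

TASK_SYSTEM_PROMPTS = {
    "translation": "You are a professional translator. Translate the user's text accurately.",
    "code": "You are an expert programmer. Write correct, well-commented code.",
    "math": "You are a careful mathematician. Solve problems step by step.",
}

def get_system_prompt(task, default="You are a helpful assistant."):
    # Fall back to a generic prompt for tasks without a dedicated entry.
    return TASK_SYSTEM_PROMPTS.get(task, default)

print(get_system_prompt("math"))
print(get_system_prompt("chat"))  # falls back to the generic prompt
```

If you do add such entries, it would be worth verifying the generated instructions actually match the intended task before training on them.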
I've noticed that the dataset you posted consists of single-turn dialogues. If you don't include multi-turn dialogue data, will that affect the model's final performance? Looking forward to your reply.