Closed lzl-mt closed 2 months ago
Hi!
I agree with you and they are indeed different!
The example at https://qwen.readthedocs.io/en/latest/training/SFT/example.html is maintained by us and the finetune.py script is also in this repo. As the original data format can be of high diversity, we have required data to be organized in a format similar to the OpenAI API, which is versatile and widely-used in the community.
The other example you showed is maintained by PAI (a different team from Alibaba Cloud) and does not apply to our codebase.
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
起始日期 | Start Date
No response
实现PR | Implementation PR
No response
相关Issues | Reference Issues
No response
摘要 | Summary
Hello, I would like to confirm what the form of finetuning data is. In https://qwen.readthedocs.io/en/latest/training/SFT/example.html, data format likes
However, in this examples: https://github.com/alibaba/Pai-Megatron-Patch/blob/main/examples/qwen1_5/README.md#Megatron-LM-Dense%E6%A8%A1%E5%9E%8B%E8%AE%AD%E7%BB%83%E6%B5%81%E7%A8%8B, dataset download and extracted like this:
Thanks!
基本示例 | Basic Example
No
缺陷 | Drawbacks
No
未解决问题 | Unresolved questions
No response