Which finetune.py were you using? Did you mix the use of the Qwen and Qwen2 models and code?
We are also in the process of deprecating the finetune.py in this repo, and we advise you to use training frameworks such as Axolotl, Llama-Factory, or Swift to finetune your models with SFT, DPO, PPO, etc.
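For anyone arriving here for the framework route, below is a minimal sketch of a LoRA SFT run with Llama-Factory. The config file name, model, dataset, and hyperparameter values are illustrative assumptions, not taken from this thread; the key names follow Llama-Factory's example configs, but check the examples shipped with your installed version.

```bash
# Hypothetical minimal LoRA SFT config for a Qwen model; key names follow
# Llama-Factory's example configs, values are placeholders to adapt.
cat > qwen_lora_sft.yaml <<'EOF'
model_name_or_path: Qwen/Qwen1.5-7B-Chat
stage: sft
do_train: true
finetuning_type: lora
dataset: identity        # sample dataset bundled with Llama-Factory; register your own in data/dataset_info.json
template: qwen
cutoff_len: 1024
output_dir: saves/qwen1.5-7b-lora-sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
EOF

llamafactory-cli train qwen_lora_sft.yaml
```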
> Which finetune.py were you using? Did you mix the use of the Qwen and Qwen2 models and code?
I'm using Qwen/finetune.py.
I was finetuning Qwen 1.5 with model_max_length 32768 and encountered issue #1307.
I was wondering whether this is a bug in Qwen 1.5 that has perhaps already been fixed in Qwen 2.0, so I also tried Qwen 2.0, not really expecting it to work.
However, the error messages and exception stacks were exactly the same after I changed the model; see my comment in issue #1307.
I want to know what caused this error and how I can work around it.
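For context, the deprecated script is driven from the command line; a run like the one described above would presumably look something like the sketch below. The flag names mirror the repo's finetune.sh example scripts, while the model, data path, and hyperparameter values are placeholders, not the reporter's actual command.

```bash
# Hedged sketch of a long-context run with the legacy Qwen finetune.py;
# flag names follow the repo's finetune.sh, values are illustrative only.
python finetune.py \
    --model_name_or_path Qwen/Qwen1.5-7B \
    --data_path data.json \
    --output_dir output_qwen \
    --bf16 True \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --model_max_length 32768 \
    --gradient_checkpointing \
    --lazy_preprocess True \
    --use_lora
```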
> We are also in the process of deprecating the finetune.py in this repo, and we advise you to use training frameworks such as Axolotl, Llama-Factory, or Swift to finetune your models with SFT, DPO, PPO, etc.
@jklj077 Thank you! Is there any quick start for that?
Ping @yangjianxin1; he is currently working on the quick start using Llama-Factory.
For now, there is a very simple version at https://qwen.readthedocs.io/en/latest/training/SFT/llama_factory.html.
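For orientation, the quick start drives Llama-Factory's entry script (src/train.py sits at the root of a Llama-Factory checkout, not in this repo). Below is a hedged sketch of that command style with placeholder values; exact flags may differ between the docs page and your Llama-Factory version.

```bash
# Run from inside a Llama-Factory checkout, where src/train.py lives.
# Values are placeholders; lora_target depends on the model architecture.
torchrun --nproc_per_node 1 src/train.py \
    --stage sft \
    --do_train \
    --model_name_or_path Qwen/Qwen1.5-7B \
    --dataset your_dataset \
    --template qwen \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir output_qwen \
    --per_device_train_batch_size 1 \
    --learning_rate 5.0e-5 \
    --num_train_epochs 3.0 \
    --bf16
```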
@yangjianxin1 Where can I find the src/train.py mentioned in https://qwen.readthedocs.io/en/latest/training/SFT/llama_factory.html?
How should I set the --flash_attn parameter?
There is an error message: "train.py: error: argument --flash_attn: expected one argument".
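That error suggests a Llama-Factory version in which --flash_attn takes an explicit value rather than acting as a boolean switch. To my understanding, the accepted choices in recent versions are auto, disabled, sdpa, and fa2, but verify against your installed version:

```bash
# Give --flash_attn an explicit backend instead of passing it as a bare flag;
# fa2 selects FlashAttention-2, auto lets the framework decide.
torchrun --nproc_per_node 1 src/train.py \
    --stage sft --do_train \
    --model_name_or_path Qwen/Qwen1.5-7B \
    --dataset your_dataset --template qwen \
    --finetuning_type lora --output_dir output_qwen \
    --flash_attn fa2
```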
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
Originally posted by @chansonzhang in https://github.com/QwenLM/Qwen/issues/1307#issuecomment-2264659388