OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning? #796

Closed 13416157913 closed 5 months ago

13416157913 commented 5 months ago

Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning?

research4pan commented 5 months ago

Thanks for your interest in LMFlow! We are integrating that feature right now, hopefully supporting it in 12-48 hours. Please stay tuned for our latest update 😄

wheresmyhair commented 5 months ago

> Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning?

Hi, we've tested on Qwen1.5-1.8B and the script works fine. (Screenshot of the qwen2 run omitted.)

Please make sure you include `--lora_target_modules q_proj,v_proj` (only for Qwen models) in the finetune shell script. (Screenshot of the script omitted.)

Also, we strongly recommend that you:

  1. Use a conversation dataset to finetune the model. You could either:
    • To test the workflow, download a conversation dataset from our data server via:

      cd data && ./download.sh alpaca && cd -

      and point the dataset path to `data/alpaca/train_conversation`, or

    • Prepare your own conversation dataset (see here)
  2. Specify the conversation template as `qwen2` for better performance.
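Putting the advice above together, a finetune run might look like the sketch below. Only `--lora_target_modules q_proj,v_proj`, the download command, the dataset path, and the `qwen2` template are confirmed in this thread; the entry point and the remaining flag names are assumptions that may differ across LMFlow versions, so check the repo's example scripts.

```shell
#!/bin/bash
# Sketch of a LoRA finetune of Qwen1.5-1.8B with LMFlow.
# Flag names other than --lora_target_modules are assumptions.

# 1. Fetch a sample conversation dataset (command from this thread).
cd data && ./download.sh alpaca && cd -

# 2. Launch finetuning. --lora_target_modules q_proj,v_proj is
#    needed only for Qwen models, per the maintainers above.
python examples/finetune.py \
  --model_name_or_path Qwen/Qwen1.5-1.8B \
  --dataset_path data/alpaca/train_conversation \
  --conversation_template qwen2 \
  --use_lora 1 \
  --lora_target_modules q_proj,v_proj \
  --output_dir output_models/qwen1.5-1.8b-sft
```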

13416157913 commented 5 months ago

> Hi, we've tested on Qwen1.5-1.8B and the script works fine. […]

Thank you very much.