OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning? #796

Closed 13416157913 closed 5 months ago

13416157913 commented 5 months ago

Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning?

research4pan commented 5 months ago

Thanks for your interest in LMFlow! We are integrating that feature right now, hopefully supporting it in 12-48 hours. Please stay tuned for our latest update 😄

wheresmyhair commented 5 months ago

> Hello, can LMFlow support Qwen1.5-1.8B model fine-tuning?

Hi, we've tested on Qwen1.5-1.8B and the script works fine. (Screenshot of the qwen2 run omitted.)

Please make sure you include `--lora_target_modules q_proj,v_proj` (only for Qwen models) in the finetune shell script. (Screenshot of the script omitted.)

Also, we strongly recommend that you:

  1. Use a conversation dataset to finetune the model. You could either:
    • To test the workflow, download a conversation dataset from our data server via:

      cd data && ./download.sh alpaca && cd -

      and point the dataset path to `data/alpaca/train_conversation`, or

    • Prepare your own conversation dataset (see here)
  2. Specify the conversation template as `qwen2` for better performance.
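Putting the advice above together, a finetune run might look like the sketch below. Only `--lora_target_modules q_proj,v_proj`, the download command, the dataset path, and the `qwen2` template are confirmed in this thread; the entry point and the remaining flag names are assumptions that may differ across LMFlow versions, so check the repo's example scripts.

```shell
#!/bin/bash
# Sketch of a LoRA finetune of Qwen1.5-1.8B with LMFlow.
# Flag names other than --lora_target_modules are assumptions.

# 1. Fetch a sample conversation dataset (command from this thread).
cd data && ./download.sh alpaca && cd -

# 2. Launch finetuning. --lora_target_modules q_proj,v_proj is
#    needed only for Qwen models, per the maintainers above.
python examples/finetune.py \
  --model_name_or_path Qwen/Qwen1.5-1.8B \
  --dataset_path data/alpaca/train_conversation \
  --conversation_template qwen2 \
  --use_lora 1 \
  --lora_target_modules q_proj,v_proj \
  --output_dir output_models/qwen1.5-1.8b-sft
```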

13416157913 commented 5 months ago

> Hi, we've tested on Qwen1.5-1.8B and the script works fine. […]

Thank you very much.