Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Training qwen1.5-moe-A2.7B-chat is slow and GPU utilization is low #868
Closed
yangzhipeng1108 closed 4 months ago
```shell
CUDA_VISIBLE_DEVICES=0 \
python3 llm_sft.py \
    --model_type qwen1half-moe-a2_7b-chat \
    --model_id_or_path /root/yovole/qwen/Qwen1.5-MoE-A2.7B-Chat \
    --sft_type lora \
    --tuner_backend swift \
    --dtype AUTO \
    --output_dir output \
    --dataset dureader-robust-zh \
    --train_dataset_sample 10000 \
    --num_train_epochs 1 \
    --max_length 1024 \
    --check_dataset_strategy warning \
    --lora_rank 8 \
    --lora_alpha 32 \
    --lora_dropout_p 0.05 \
    --lora_target_modules ALL \
    --gradient_checkpointing true \
    --batch_size 1 \
    --weight_decay 0.1 \
    --learning_rate 1e-4 \
    --gradient_accumulation_steps 16 \
    --max_grad_norm 0.5 \
    --warmup_ratio 0.03 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --logging_steps 10 \
    --use_flash_attn true \
    --self_cognition_sample 1000 \
    --custom_train_dataset_path /root/yovole/qwen/data/alpaca-gpt4-data-zh/alpaca_gpt4_data_zh.json \
    --custom_val_dataset_path /root/yovole/qwen/data/alpaca-gpt4-data-zh/alpaca_gpt4_data_zh.json \
    --model_name 卡卡罗特 \
    --model_author 陶白白
```
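One way to narrow down low GPU utilization with a command like this is to time the data-loading and compute phases of each training step separately: if most of the wall-clock time falls outside the forward/backward pass, the input pipeline (or settings like `--gradient_checkpointing true` and `--batch_size 1`, which shrink the GPU-side work per step) is the likelier culprit than the model itself. A minimal stdlib-only sketch of such a timing harness — `load_batch` and `train_step` here are hypothetical placeholders standing in for the real dataloader and training step, not part of swift:

```python
import time
from collections import defaultdict

def profile_steps(load_batch, train_step, num_steps=10):
    """Time the data-loading and compute phases of each step separately."""
    totals = defaultdict(float)
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = load_batch()       # CPU-side input pipeline
        t1 = time.perf_counter()
        train_step(batch)          # forward/backward/optimizer work
        t2 = time.perf_counter()
        totals["data"] += t1 - t0
        totals["compute"] += t2 - t1
    # Average seconds per step for each phase.
    return {phase: total / num_steps for phase, total in totals.items()}

# Toy stand-ins for the real dataloader and training step.
stats = profile_steps(lambda: list(range(1000)), lambda b: sum(b), num_steps=5)
print(stats)
```

If the `data` phase dominates, raising dataloader throughput (more workers, cached preprocessing) will help more than any model-side flag; `watch nvidia-smi` during a run gives the corresponding GPU-side view.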