Qwen1.5-7B becomes much larger after merging LoRA?

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)

https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html

Apache License 2.0


Qwen1.5-7B becomes much larger after merging LoRA? #715

Closed qianliyx closed 6 months ago

qianliyx commented 6 months ago

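I fine-tuned qwen1.5-7b following the self-cognition fine-tuning example from the official swift documentation, then merged the weights with CUDA_VISIBLE_DEVICES=0 swift export --ckpt_dir xxx --merge_lora true. The merged model came out at twenty-some GB, roughly twice the size of the original model. Why?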

Is it because dtype was set to auto during LoRA fine-tuning, which selected fp32 instead of fp16?
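
As a rough sanity check on the dtype hypothesis, checkpoint size can be estimated as parameter count times bytes per parameter. The sketch below is only back-of-the-envelope arithmetic: the figure of about 7.7e9 parameters for Qwen1.5-7B is an approximation assumed here, not a number taken from this issue, and the helper name estimate_size_gib is hypothetical.

```python
# Rough checkpoint-size estimate: parameter count x bytes per parameter.
# ASSUMPTION: ~7.7e9 parameters for Qwen1.5-7B (approximate, not from the issue).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def estimate_size_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size in GiB of the weights stored in `dtype`."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    approx_params = 7.7e9  # assumed, see note above
    for dtype in ("fp16", "fp32"):
        print(f"{dtype}: {estimate_size_gib(approx_params, dtype):.1f} GiB")
```

Under that assumption, an fp32 checkpoint is exactly twice the size of an fp16 or bf16 one, which would be consistent with a merged model of twenty-some GB against an original of about half that. It does not confirm which dtype swift export actually wrote; checking the torch_dtype field in the merged model's config.json would be one way to verify.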