How to save model weights in fp16 format

lufred8341 commented 1 month ago

Reminder

[X] I have read the README and searched the existing issues.

System Info

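I trained the model on macOS, where fp32 is the only training precision I can select. The saved model files are very large, e.g. 57 GB for Qwen 14B, so I assume they are stored in fp32. LLaMA-Factory does not currently provide a conversion feature. Could one be added in the future? Alternatively, I wrote the simple code below based on examples I found online. Is there anything wrong with it? Something about it feels off to me.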

Reproduction

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
model = AutoPeftModelForCausalLM.from_pretrained(lora_path, trust_remote_code=True).half().to("mps")

model = model.merge_and_unload()

tokenizer.save_pretrained(save_path)
model.save_pretrained(save_path)

Expected behavior

Thanks a lot!

Others

No response

lufred8341 commented 1 month ago

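By "saving the model" I mean merging and exporting the model after LoRA training on a Mac. The exported model files are very large, about twice the expected size, so I assume they are in fp32. Is there a parameter that makes the export write fp16 weights? Or is there a problem with the code I wrote?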
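The size reasoning above can be checked with a quick back-of-the-envelope calculation. This is only a rough sketch: the helper name `estimate_checkpoint_gb` is hypothetical, and the nominal count of 14 billion parameters is an assumption taken from the "14B" in the model name, not an exact figure. fp32 stores 4 bytes per parameter and fp16 stores 2.

```python
# Bytes used to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def estimate_checkpoint_gb(num_params: int, dtype: str) -> float:
    """Estimate checkpoint size in GB (10^9 bytes), ignoring file metadata."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: a nominal 14 billion parameters for a "14B" model.
NUM_PARAMS = 14_000_000_000

for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{estimate_checkpoint_gb(NUM_PARAMS, dtype):.0f} GB")
```

Under that assumption the estimate is about 56 GB in fp32 and about 28 GB in fp16, which is consistent with a 57 GB export being fp32 and with the "twice as large" observation.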

hiyouga commented 1 month ago

llamafactory-cli export ... --infer_dtype float16
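For a script-based route instead of the CLI, the following is a minimal sketch using `transformers` and `peft`; it is not how LLaMA-Factory implements the export. The function name `merge_lora_to_fp16` is hypothetical, and the sketch assumes the merge can run on the CPU and that loading the weights directly in half precision via `torch_dtype=torch.float16` is acceptable for this model.

```python
def merge_lora_to_fp16(lora_path: str, base_model_path: str, save_path: str) -> None:
    """Merge a LoRA adapter into its base model and save the result in fp16."""
    # Heavy dependencies are imported lazily, so defining the function is cheap.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)

    # Load base model plus adapter directly in half precision.
    # Assumption: merging on the CPU is fine; no .to("mps") is required to save.
    model = AutoPeftModelForCausalLM.from_pretrained(
        lora_path, torch_dtype=torch.float16, trust_remote_code=True
    )

    # Fold the LoRA weights into the base weights and drop the adapter wrappers.
    model = model.merge_and_unload()

    tokenizer.save_pretrained(save_path)
    model.save_pretrained(save_path)
```

Compared with calling `.half()` after loading, passing the dtype at load time avoids first materializing a full fp32 copy of the weights. The saved checkpoint should be fp16 either way.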

hiyouga / LLaMA-Factory