CVI-SZU / Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction-tuning datasets

Could you provide a method or script for quantizing the model to int4? #49

Open doctor1984 opened 1 year ago

riverzhou commented 1 year ago

First convert the model to the LLaMA format, then quantize it with the llama.cpp scripts. You can quantize to int4, int5, or int8.
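For anyone wondering what the int4 step does to the weights, here is a minimal pure-Python sketch of block-wise symmetric 4-bit quantization. It is only an illustration of the general idea and is not llama.cpp's actual file format or code. The block size of 32, the scale of max|w| / 7, and all function names are assumptions chosen for the example.

```python
# Simplified illustration of block-wise int4 quantization.
# NOT llama.cpp's real layout; block size and scale rule are assumptions.

def quantize_block_int4(values):
    """Map a block of floats to signed 4-bit integers in [-8, 7] plus one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        # An all-zero block needs no scale.
        return 0.0, [0] * len(values)
    scale = max_abs / 7.0  # the largest magnitude maps to +/-7
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block_int4(scale, quants):
    """Recover approximate floats from one quantized block."""
    return [scale * q for q in quants]


def quantize_int4(weights, block_size=32):
    """Split a flat list of weights into blocks and quantize each block independently."""
    return [
        quantize_block_int4(weights[start:start + block_size])
        for start in range(0, len(weights), block_size)
    ]


def dequantize_int4(blocks):
    """Concatenate the dequantized blocks back into a flat list."""
    out = []
    for scale, quants in blocks:
        out.extend(dequantize_block_int4(scale, quants))
    return out
```

Calling `dequantize_int4(quantize_int4(weights))` returns values that differ from the originals by at most about half of each block's scale. That rounding error is the precision you trade for the smaller file.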

doctor1984 commented 1 year ago

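> First convert the model to the LLaMA format, then quantize it with the llama.cpp scripts. You can quantize to int4, int5, or int8.

Is there a script for converting the bin file to pth? Could you please share one? Thanks a lot.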

fengyh3 commented 1 year ago

You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

doctor1984 commented 1 year ago

> You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

Thanks a lot, I'll give it a try.

doctor1984 commented 1 year ago

> You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

The P01son/Linly-ChatFlow-13B model still cannot be used after being converted to int4.