Could you provide a method or script for quantizing the model to int4?

CVI-SZU / Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining / instruction fine-tuning datasets


Open doctor1984 opened 1 year ago

riverzhou commented 1 year ago

First convert the model to the llama format, then quantize it with the llama.cpp scripts. It can be quantized to int4, int5, or int8.
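
To make concrete what "quantizing to int4" means, here is a minimal sketch of symmetric 4-bit block quantization in plain Python. It is an illustration only and not llama.cpp's actual implementation: the block size of 32, the one-scale-per-block layout, and all function names are assumptions made for this example.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block size, chosen for illustration only


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Quantize one block to (scale, 4-bit signed integers in [-8, 7])."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return 0.0, [0] * len(values)
    scale = max_abs / 7.0  # the largest magnitude maps to +/-7
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale: float, quants: List[int]) -> List[float]:
    """Recover approximate floats from one quantized block."""
    return [q * scale for q in quants]


def quantize(weights: List[float], block_size: int = BLOCK_SIZE):
    """Split a flat weight list into blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks) -> List[float]:
    """Concatenate the dequantized blocks back into a flat list."""
    out: List[float] = []
    for scale, quants in blocks:
        out.extend(dequantize_block(scale, quants))
    return out
```

Each weight is stored as a 4-bit integer plus a shared per-block scale, so the reconstruction error for a weight is at most half of its block's scale. Real quantizers pack two 4-bit values per byte and may store the scale in half precision; this sketch leaves that out.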

doctor1984 commented 1 year ago

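> First convert the model to the llama format, then quantize it with the llama.cpp scripts. It can be quantized to int4, int5, or int8.

Is there a script for converting the bin file to pth? Could you share one? Thanks a lot.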

fengyh3 commented 1 year ago

You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

doctor1984 commented 1 year ago

> You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

Thank you very much, I'll give it a try.
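
The steps discussed so far (TencentPretrain checkpoint to llama format, then llama.cpp conversion and quantization) can be sketched as a small Python helper that only assembles the command lines and prints them. Everything specific here is an assumption, not something confirmed in this thread: the `--input_model_path`, `--output_model_path`, and `--layers_num` flags of the conversion script, the `convert.py` and `quantize` entry points of llama.cpp (their names differ between llama.cpp versions), the intermediate file names, and all directory paths. Check each tool's `--help` output before running anything.

```python
import shlex
from pathlib import Path
from typing import List


def build_pipeline(
    tp_checkpoint: str,
    work_dir: str,
    layers_num: int,
    tencentpretrain_dir: str = "TencentPretrain",  # hypothetical checkout path
    llama_cpp_dir: str = "llama.cpp",              # hypothetical checkout path
    quant_type: str = "q4_0",                      # assumed 4-bit type name
) -> List[List[str]]:
    """Return the three assumed commands as argument lists (nothing is run)."""
    work = Path(work_dir)
    llama_pth = work / "consolidated.00.pth"       # assumed llama-format file name
    f16_model = work / "ggml-model-f16.bin"        # assumed converter output name
    quantized = work / f"ggml-model-{quant_type}.bin"
    return [
        # Step 1: TencentPretrain .bin -> llama-format .pth (flags are assumptions)
        ["python3",
         str(Path(tencentpretrain_dir) / "scripts" / "convert_tencentpretrain_to_llama.py"),
         "--input_model_path", str(tp_checkpoint),
         "--output_model_path", str(llama_pth),
         "--layers_num", str(layers_num)],
        # Step 2: llama-format weights -> llama.cpp f16 file. The directory is
        # assumed to also hold params.json and tokenizer.model.
        ["python3", str(Path(llama_cpp_dir) / "convert.py"), str(work)],
        # Step 3: f16 file -> 4-bit quantized file
        [str(Path(llama_cpp_dir) / "quantize"), str(f16_model), str(quantized), quant_type],
    ]


if __name__ == "__main__":
    # 40 is the layer count of LLaMA-13B; the paths are placeholders.
    for cmd in build_pipeline("models/chatflow_13b.bin", "models/13B", layers_num=40):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run and makes it easy to adjust each step by hand if a script name or flag differs in your checkout.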

doctor1984 commented 1 year ago

> You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

The P01son/Linly-ChatFlow-13B model still does not work after being converted to int4.