OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University
https://txsun1997.github.io/blogs/moss.html
Apache License 2.0

How to convert a fine-tuned MOSS model to a quantized version? #245

Open qgpmztmf opened 1 year ago

qgpmztmf commented 1 year ago

I couldn't find code for this process in this repository. Has anyone successfully converted a fine-tuned MOSS model to its quantized version? If so, could you please share the steps or code you used?

qgpmztmf commented 1 year ago

.

JIEKEXIAN commented 1 year ago

I tested their int4 model and found that inference with the quantized model is actually slower than with the unquantized one.
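
A speed comparison like this one can be made reproducible with a small timing harness. The following is only a sketch: `tokens_per_second` and `dummy_generate` are hypothetical names, and `generate` stands in for whatever function wraps the model under test (it is assumed to take a prompt and return a list of generated tokens).

```python
import time


def tokens_per_second(generate, prompt, n_runs=3):
    """Return the average number of generated tokens per second.

    `generate` is a hypothetical callable (prompt -> list of tokens);
    replace it with a wrapper around the model being measured.
    """
    generate(prompt)  # warm-up call, excluded from the timing
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def dummy_generate(prompt):
    """Stand-in for a real model: sleeps briefly and echoes the prompt words."""
    time.sleep(0.01)
    return prompt.split()


if __name__ == "__main__":
    rate = tokens_per_second(dummy_generate, "a b c d")
    print(f"{rate:.1f} tokens/s")
```

Running the same harness once with the fp16 model and once with the int4 model, on the same prompt and generation settings, gives comparable numbers. When timing on a GPU, the wrapper should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, otherwise the measured time may be too short.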

qgpmztmf commented 1 year ago

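I tested their int4 model and found that inference with the quantized model is actually slower than with the unquantized one.

Quantization does not necessarily speed up inference; its main purpose is to reduce the GPU memory the model occupies.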
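
The memory argument can be made concrete with a rough back-of-the-envelope estimate. This is a sketch under stated assumptions: the parameter count of about 16 billion is assumed here for illustration, `weight_memory_gib` is a hypothetical helper, and the estimate covers weights only (no quantization scales or zero points, activations, or KV cache).

```python
def weight_memory_gib(n_params, bits_per_weight):
    """Approximate memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3


# Assumption: roughly 16 billion parameters; adjust for the actual checkpoint.
N_PARAMS = 16e9

if __name__ == "__main__":
    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Going from fp16 to int4 cuts weight storage to a quarter, which is what makes the model fit on a smaller GPU. Speed is a separate matter: int4 kernels typically unpack the weights to a higher precision before the matrix multiply, and that extra work may offset the reduced memory traffic.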