InternLM / InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

After fine-tuning InternLM-XComposer2-VL-7B with LoRA, how can it be quantized to an int4 model for inference? #208

Closed iFe1er closed 6 months ago

iFe1er commented 6 months ago

I have read the auto-gptq documentation, and the quantization there requires data for quantization training. Which data should be used specifically — plain text, or a mix of images and text? Is there a concrete method and code implementation? Thanks!

iFe1er commented 6 months ago

@yhcao6 @panzhang0212 Could you please help? Thanks!

LightDXY commented 6 months ago

Hi, we used auto-gptq's default quantization method and did not introduce any quantization training: https://github.com/AutoGPTQ/AutoGPTQ?tab=readme-ov-file#quick-tour
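
For reference, the quick-tour linked above boils down to roughly the following post-training quantization flow. This is a minimal sketch; the model path, `trust_remote_code` flags, and calibration text below are placeholders rather than the exact setup used here:

```python
# Minimal post-training quantization sketch following the AutoGPTQ quick-tour.
# The model path and calibration text are placeholders, not the maintainers' exact setup.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_dir = "internlm/internlm-xcomposer2-vl-7b"   # placeholder checkpoint path
quantized_dir = "internlm-xcomposer2-vl-7b-4bit"

tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, trust_remote_code=True)

# Calibration examples: tokenized text samples (the quick-tour uses a single sentence).
examples = [
    tokenizer("auto-gptq is an easy-to-use model quantization library with user-friendly apis.")
]

quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to int4
    group_size=128,  # quantization group size
    desc_act=False,  # disable activation-order reordering for faster inference
)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_dir, quantize_config, trust_remote_code=True)
model.quantize(examples)             # one-shot post-training quantization; no quantization training
model.save_quantized(quantized_dir)
```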

iFe1er commented 6 months ago

But at https://github.com/AutoGPTQ/AutoGPTQ?tab=readme-ov-file#quick-tour, the AutoGPTQ documentation states: "warning: this is just a showcase of the usage of basic apis in AutoGPTQ, which uses only one sample to quantize a much small model, quality of quantized model using such little samples may not good."

Quantizing without data, or with only a handful of calibration samples, may degrade the quality of the quantized model. @LightDXY
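
As a rough idea of what a full pipeline for the question in the title might look like (a sketch only, not a verified recipe for this model): merge the LoRA adapter into the base weights first, then quantize with a larger, representative text calibration set. The paths, the sample count, and the assumption that InternLM-XComposer2-VL-7B's custom multimodal code loads cleanly through `AutoGPTQForCausalLM` are all placeholders/assumptions:

```python
# Sketch only: assumes the LoRA adapter was trained with peft and that the model loads
# through AutoGPTQForCausalLM; paths and the calibration set below are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from peft import PeftModel

base_dir = "internlm/internlm-xcomposer2-vl-7b"    # placeholder base checkpoint
lora_dir = "path/to/lora-adapter"                  # placeholder LoRA output directory
merged_dir = "internlm-xcomposer2-vl-7b-merged"
quantized_dir = "internlm-xcomposer2-vl-7b-4bit"

# 1) Merge the LoRA adapter into the base weights so the quantizer sees a plain checkpoint.
base = AutoModelForCausalLM.from_pretrained(base_dir, trust_remote_code=True)
merged = PeftModel.from_pretrained(base, lora_dir).merge_and_unload()
merged.save_pretrained(merged_dir)

tokenizer = AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True)
tokenizer.save_pretrained(merged_dir)

# 2) Use a larger calibration set (e.g. a few hundred text samples drawn from the
#    fine-tuning data) instead of the single sentence shown in the quick-tour.
calibration_texts = ["<replace with representative text samples>"] * 128  # placeholder
examples = [tokenizer(text) for text in calibration_texts]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(merged_dir, quantize_config, trust_remote_code=True)
model.quantize(examples)
model.save_quantized(quantized_dir)
```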