Can Kolors load a quantized ChatGLM3 TextEncoder?

modelscope / DiffSynth-Studio

Closed: JoshonSmith closed this issue 1 month ago

JoshonSmith commented 2 months ago

This is similar to https://github.com/Kwai-Kolors/Kolors/issues/39. Following the official ChatGLM3-6B-Base approach, changing the model loading to use .quantize(4) immediately reduces VRAM usage by 5-6 GB.

Artiprocher commented 2 months ago

OK, we will add support for this feature a bit later.

Artiprocher commented 2 months ago

Install the extra library required for quantization:

pip install cpm_kernels

Then quantize the text encoder:

pipe.text_encoder_kolors = pipe.text_encoder_kolors.quantize(4)
torch.cuda.empty_cache()
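To check how much VRAM the quantization actually frees on a given setup, a minimal sketch like the one below could be used. The helper name `quantize_text_encoder` is hypothetical and not part of DiffSynth-Studio. It assumes `pipe` is an already-loaded Kolors pipeline that exposes `text_encoder_kolors` with a `quantize(bits)` method returning the quantized model, as in the lines above.

```python
try:
    import torch
except ImportError:  # allows importing this helper on a machine without PyTorch
    torch = None


def quantize_text_encoder(pipe, bits=4):
    """Quantize pipe.text_encoder_kolors and report the VRAM freed.

    Hypothetical helper: `pipe` is assumed to be a loaded Kolors pipeline.
    Returns the number of bytes freed according to PyTorch's allocator,
    or None when CUDA (or PyTorch) is not available.
    """
    cuda = torch is not None and torch.cuda.is_available()
    before = torch.cuda.memory_allocated() if cuda else None

    # Same call as in the comment above: replace the encoder with its quantized version.
    pipe.text_encoder_kolors = pipe.text_encoder_kolors.quantize(bits)

    if not cuda:
        return None

    # Release cached blocks so the saving is also visible in tools such as nvidia-smi.
    torch.cuda.empty_cache()
    return before - torch.cuda.memory_allocated()
```

Calling `freed = quantize_text_encoder(pipe)` and printing `freed / 1024**3` gives the saving in GiB. The figure reflects only tensors tracked by PyTorch's allocator, so it may differ somewhat from what `nvidia-smi` shows.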