HighCWu / flux-4bit

MIT License
17 stars, 3 forks

Is there any quantization code available? #4

Open CrushDemo01 opened 1 month ago

CrushDemo01 commented 1 month ago

Hello, and thank you for your work. It has helped me a lot.

However, the model's current video memory usage is still too high for my hardware.

I wonder whether it can be quantized further, for example to 2-bit? Could you share a reference to the quantization code you used?

Thank you!

HighCWu commented 1 month ago

I think you can try lllyasviel/stable-diffusion-webui-forge. That repository has made many attempts to run 4-bit models like this one in less video memory. Believe me, 2-bit quantization is definitely not good.
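To give a rough sense of why 2-bit is so much lossier than 4-bit, here is a minimal sketch of plain round-to-nearest quantization with one scale and offset per group of weights. It is an illustration only, not the code used to produce this repository's checkpoint; the function names, the group size of 64, and the synthetic Gaussian weights are all assumptions made for the example.

```python
import random


def quantize_dequantize(weights, bits, group_size=64):
    """Round-to-nearest asymmetric quantization per group, then dequantize.

    Hypothetical helper for illustration; real libraries use more
    sophisticated schemes than this.
    """
    levels = (1 << bits) - 1  # number of steps between min and max
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels if hi > lo else 1.0
        for w in group:
            code = round((w - lo) / scale)  # integer code in [0, levels]
            restored.append(lo + code * scale)
    return restored


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length lists."""
    total = sum((a - b) ** 2 for a, b in zip(original, restored))
    return (total / len(original)) ** 0.5


# Synthetic stand-in for a weight tensor (assumption: small Gaussian values).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]

errors = {
    bits: rms_error(weights, quantize_dequantize(weights, bits))
    for bits in (8, 4, 2)
}
for bits, err in errors.items():
    print(f"{bits}-bit RMS error: {err:.6f}")
```

With 2 bits each group has only 4 representable values, so the step between neighbouring values is 5 times larger than with 4 bits (3 intervals instead of 15 over the same range), and the reconstruction error grows accordingly.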

CrushDemo01 commented 1 month ago

Thank you for your helpful reply. I have also looked at stable-diffusion-webui-forge, but it does not quite fit what I have in mind. I would still like to try a quantization approach and at least compress T5 further. I have tried some of the quantization interfaces provided by libraries such as hqq and transformers, but none of them worked.