Add GPTQ support.
Users must add "disable_exllama": true to the model's quantization config file. I tried to set this in code instead, but BigDL keeps ignoring the value.
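For reference, a minimal sketch of what the quantization section of the model's config file might look like with that flag set (the surrounding field names follow the common Hugging Face GPTQ layout and are an assumption; only "disable_exllama": true is from this note):

```json
{
  "quantization_config": {
    "quant_method": "gptq",
    "bits": 4,
    "disable_exllama": true
  }
}
```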
Add a whole bunch of new quantization data types.
NF3 / 3-bit is the most interesting one.
Update IPEX libs.
Attention optimizations.
Fixed lock-ups that occurred with very large attention queries.