qwopqwop200 / GPTQ-for-LLaMa

4 bits quantization of LLaMA using GPTQ
Apache License 2.0

fused mlp is sometimes not working with safetensors, add an argument for it #244

Closed DalasNoin closed 1 year ago

DalasNoin commented 1 year ago

Fused MLP sometimes does not work with safetensors, so this adds an argument for it: passing `no_fused_mlp` sets `fused_mlp` to `False`; the default remains `True`.

I have had the same issue as some others: https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/243
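The flag described above could be wired up roughly as follows. This is a minimal sketch assuming an argparse-based CLI; the flag name follows the issue text, but the actual option spelling and parser in the repository may differ.

```python
import argparse

# Hypothetical sketch of the proposed option: --no_fused_mlp flips
# fused_mlp to False, while the default stays True, matching the
# behavior described in the issue.
parser = argparse.ArgumentParser(description="GPTQ-for-LLaMa inference (sketch)")
parser.add_argument(
    "--no_fused_mlp",
    dest="fused_mlp",
    action="store_false",
    help="Disable the fused MLP path (workaround for safetensors issues).",
)
parser.set_defaults(fused_mlp=True)

args = parser.parse_args(["--no_fused_mlp"])
print(args.fused_mlp)  # False: fused MLP disabled by the flag
```

Using `action="store_false"` with `set_defaults` keeps the common path (fused MLP on) as the default, so only users hitting the safetensors problem need to pass the flag.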