Open · philipturner opened this issue 1 year ago
For smaller models, quantization causes more quality loss than it does for large models. Could the repository try 6-bit quantization with group size 128 for models like LLaMa-7B? This could also be useful for some of the smaller language networks in Stable Diffusion.
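For context, here is a minimal sketch of what "6-bit / 128 groups" means: each group of 128 weights gets its own scale and zero-point, and values are rounded to 6-bit integers. This uses simple round-to-nearest and a hypothetical `quantize_groupwise` helper for illustration only; it is not the repository's actual quantization code (which would use a GPTQ-style error-compensating scheme).

```python
import torch

def quantize_groupwise(weights: torch.Tensor, bits: int = 6, group_size: int = 128):
    """Round-to-nearest group-wise quantization sketch (illustration only)."""
    rows, cols = weights.shape
    assert cols % group_size == 0, "pad columns to a multiple of group_size"
    qmax = 2 ** bits - 1  # 63 for 6-bit

    # Split each row into groups of `group_size` columns.
    w = weights.reshape(rows, cols // group_size, group_size)
    w_min = w.min(dim=-1, keepdim=True).values
    w_max = w.max(dim=-1, keepdim=True).values

    # Per-group scale and zero-point.
    scale = (w_max - w_min).clamp(min=1e-8) / qmax
    zero = torch.round(-w_min / scale)

    # Quantize, then dequantize to inspect the reconstruction error.
    q = torch.clamp(torch.round(w / scale) + zero, 0, qmax)
    dequant = (q - zero) * scale
    return q.reshape(rows, cols).to(torch.int32), dequant.reshape(rows, cols)

# Example: quantize one 4096x4096 layer and check the error.
w = torch.randn(4096, 4096)
q, w_hat = quantize_groupwise(w, bits=6, group_size=128)
print("mean abs error:", (w - w_hat).abs().mean().item())
```

With 6 bits per weight plus a scale and zero-point per 128-weight group, the storage cost is only slightly above 6 bits/weight, while the extra 2 bits over 4-bit schemes noticeably reduce rounding error on small models.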
Yes, 6-bit would work great for 13B and below to keep the model smarter.