budgetdevv opened 6 days ago
What is the purpose of this API? Do I need to use it when running a quantized GGUF model? Thanks

Hey, the short answer is: there is no real use for this, don't use it.

The longer version is that this method loads the model's weights quantized on the fly into the given format, without saving them in that format. This was necessary early in stable-diffusion.cpp's development, when saving quantized models wasn't possible yet. But given how long it takes to load a model this way, I don't really see any use case where it beats simply converting the model beforehand.

Hey, thanks for the prompt response! I was wondering why it took so long.
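For reference, the pre-conversion recommended in the answer can be done once with stable-diffusion.cpp's convert mode. The exact flags and file names below are illustrative and may differ between versions of the `sd` CLI, so treat this as a sketch rather than the definitive invocation:

```shell
# One-time conversion: quantize the weights and save them as GGUF
# (flag names and model paths are examples; check sd --help for your build).
./bin/sd -M convert -m models/v1-5-pruned-emaonly.safetensors \
    -o models/v1-5-pruned-emaonly.q8_0.gguf --type q8_0

# Subsequent runs load the already-quantized file directly, which avoids
# re-quantizing the full-precision weights on every load:
./bin/sd -m models/v1-5-pruned-emaonly.q8_0.gguf -p "a photo of a cat"
```

The tradeoff is simple: quantize-on-load pays the quantization cost at every startup, while a one-time conversion pays it once and produces a smaller file to load from then on.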