huggingface / optimum-nvidia

Apache License 2.0
Bring back quantization with Nvidia ModelOpt #147

Closed: mfuntowicz closed this 2 months ago

mfuntowicz commented 2 months ago

This PR restores the ability to quantize (and sparsify) models using Nvidia's new ModelOpt library.