huggingface / optimum-quanto

A PyTorch quantization backend for Optimum
Apache License 2.0

More refactoring #316

Closed dacorvo closed 2 months ago

dacorvo commented 2 months ago

What does this PR do?

This is another refactoring pull request in preparation for the introduction of the Marlin int4 kernel.