huggingface / optimum-quanto

A PyTorch quantization backend for Optimum
Apache License 2.0
833 stars, 62 forks

Stricter optimized tensor tests #320

Closed. dacorvo closed this 2 months ago

dacorvo commented 2 months ago

What does this PR do?

This PR makes sure that QTensor linear operations using optimized kernels give the same results as those using dequantized weights.
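
To illustrate the kind of check described above, here is a minimal sketch in plain PyTorch. It is not the test code from this PR and does not use the optimum-quanto API: `quantize_per_channel`, `linear_dequantized` and `linear_fused` are hypothetical helpers, and `linear_fused` is only a stand-in for an optimized kernel. The tolerances are assumptions as well.

```python
import torch
import torch.nn.functional as F


def quantize_per_channel(weight, bits=8):
    """Hypothetical helper: symmetric per-output-channel integer quantization."""
    qmax = 2 ** (bits - 1) - 1
    # One scale per output row, guarded against all-zero rows.
    scale = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.round(weight / scale).clamp(-qmax, qmax).to(torch.int8)
    return q, scale


def linear_dequantized(x, q, scale, bias=None):
    """Reference path: dequantize the weights first, then run a regular linear."""
    return F.linear(x, q.to(x.dtype) * scale, bias)


def linear_fused(x, q, scale, bias=None):
    """Stand-in for an optimized kernel (assumption): multiply by the integer
    values and apply the per-channel scales to the outputs afterwards."""
    out = (x @ q.to(x.dtype).t()) * scale.t()
    if bias is not None:
        out = out + bias
    return out


torch.manual_seed(0)
x = torch.randn(4, 16)   # batch of inputs
w = torch.randn(8, 16)   # float weights of a Linear(16, 8)
b = torch.randn(8)

q, scale = quantize_per_channel(w)
ref = linear_dequantized(x, q, scale, b)
out = linear_fused(x, q, scale, b)

# Both paths should agree up to floating-point rounding.
torch.testing.assert_close(out, ref, rtol=1e-4, atol=1e-4)
```

Both paths start from the same quantized weights, so any difference beyond rounding would point to a problem in the optimized path rather than to quantization error.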