huggingface / optimum-quanto

A PyTorch quantization backend for Optimum
Apache License 2.0

Update benchmark #200

Closed: dacorvo closed this 4 months ago

dacorvo commented 4 months ago

What does this PR do?

Update benchmark numbers after the introduction of float16 x int4 kernels.