Add support for symmetric quantization

FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Apache License 2.0


Add support for symmetric quantization #124

Closed julian-q closed 2 months ago

julian-q commented 1 year ago

This PR adds support for symmetric quantization when compressing and decompressing tensors. This makes it possible to compare the performance of symmetric and asymmetric quantization.

- Supports storing compressed tensors without a zero point
- Adds some bitwise operations to support signed quantized values
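For readers unfamiliar with the distinction, the two changes listed above can be illustrated with a minimal pure-Python sketch. This is not the code from this PR: the function names (`quantize_symmetric`, `quantize_asymmetric`, `pack_int4`, `unpack_int4`), the 4-bit default, and the two-values-per-byte packing layout are all assumptions chosen for illustration. The sketch shows that symmetric quantization stores only a scale (no zero point), and that signed values need a sign-extension step when unpacked from unsigned storage.

```python
# Illustrative sketch only; names and layout are hypothetical, not FlexLLMGen's API.

def quantize_symmetric(values, num_bits=4):
    """Map floats to signed integers in [-qmax, qmax]. Only a scale is kept."""
    qmax = (1 << (num_bits - 1)) - 1          # 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize_symmetric(q, scale):
    """Recover approximate floats from signed integers and the scale."""
    return [x * scale for x in q]


def quantize_asymmetric(values, num_bits=4):
    """Map floats to unsigned integers in [0, qmax]. Needs scale and zero point."""
    qmax = (1 << num_bits) - 1                # 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [max(0, min(qmax, round((v - lo) / scale))) for v in values]
    return q, scale, lo                       # lo plays the role of the zero point


def dequantize_asymmetric(q, scale, zero_point):
    """Recover approximate floats from unsigned integers, scale, and zero point."""
    return [x * scale + zero_point for x in q]


def pack_int4(q):
    """Pack signed 4-bit values two per byte, as two's-complement nibbles."""
    q = list(q) + [0] * (len(q) % 2)          # pad to an even length
    return bytes(((q[i] & 0xF) << 4) | (q[i + 1] & 0xF)
                 for i in range(0, len(q), 2))


def unpack_int4(packed):
    """Unpack nibbles and sign-extend each one back to a signed integer."""
    out = []
    for byte in packed:
        for nibble in (byte >> 4, byte & 0xF):
            out.append((nibble ^ 0x8) - 0x8)  # maps 0..15 to 0..7, -8..-1
    return out
```

In this sketch the symmetric path stores one fewer value per group (no zero point), at the cost of wasting part of the integer range when the data is not centered on zero. That trade-off is the kind of comparison the PR description refers to.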