Per-channel zero points but per-tensor scales

Xilinx / brevitas

Brevitas: neural network quantization in PyTorch

https://xilinx.github.io/brevitas/

Other

1.14k stars 189 forks source link

Per-channel zero points but per-tensor scales #929

Open V0XNIHILI opened 4 months ago

V0XNIHILI commented 4 months ago

How would I go about implementing the above feature in Brevitas? Where should I get started and is this even possible in the way Brevitas is currently set up?

V0XNIHILI commented 4 months ago

@Giuseppe5 do you know more about this?

Giuseppe5 commented 4 months ago

Yes, apologies for the late reply. Later tomorrow I will publish an example of how to achieve this, but in general it should be possible since zero point and scale have independent shapes that can be individually specified when instantiating a new quantizer.

V0XNIHILI commented 3 months ago

@Giuseppe5 if you have the time, I would still really love to see how we can achieve this with Brevitas 😁

Giuseppe5 commented 2 months ago

Apologies for the delay, but it is a bit more complicated than I originally thought.

It is still possible but it requires to play a bit more with dependency injection. I'll try to look into it but not sure when at this point. Apologies.

V0XNIHILI commented 2 months ago

No problem! Do you have a reference for me on where to start from/where to look?

Giuseppe5 commented 2 months ago

You need to create a quantizer where:

scaling_shape is scalar shape (ref here: https://github.com/Xilinx/brevitas/blob/6bdb1f88b2f475d9618e0fd49c331bd11f7131f0/src/brevitas/quant/solver/parameter.py#L111)
The quantizer inherits from this class: https://github.com/Xilinx/brevitas/blob/200456825f3b4b8db414f2b25b64311f82d3991a/src/brevitas/quant/solver/weight.py#L36
The zero point shape is equal to: https://github.com/Xilinx/brevitas/blob/200456825f3b4b8db414f2b25b64311f82d3991a/src/brevitas/quant/solver/weight.py#L53C9-L53C41