Xilinx / brevitas

Brevitas: neural network quantization in PyTorch
https://xilinx.github.io/brevitas/

Move DelayWrapper logic to Proxy #1023

Open Giuseppe5 opened 2 months ago

Giuseppe5 commented 2 months ago

Currently it's handled at the lowest level of integer quantization, but I argue it's not very intuitive. When a quantizer is called, it should always quantize. Then the proxy will decide whether to return the original float value (having still called the quantizer and accumulated statistics/computed gradients) or the quantized value.

Pros:

Cons:
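A minimal sketch of how the proposed split could look. All names here (`IntQuantizer`, `QuantProxy`, `delay_steps`) are illustrative, not the actual Brevitas API; the point is only that the quantizer always runs, while the proxy decides which value to return during the delay period:

```python
# Hypothetical sketch of the proposal; names are illustrative,
# not the actual Brevitas classes.

class IntQuantizer:
    """Always quantizes; also accumulates statistics on every call."""
    def __init__(self, scale=0.1):
        self.scale = scale
        self.num_calls = 0  # stands in for accumulated statistics

    def __call__(self, x):
        self.num_calls += 1
        return round(x / self.scale) * self.scale


class QuantProxy:
    """Owns the delay logic instead of the quantizer."""
    def __init__(self, quantizer, delay_steps=0):
        self.quantizer = quantizer
        self.delay_steps = delay_steps
        self.step = 0

    def __call__(self, x):
        # The quantizer is always invoked, so statistics (and, in the
        # real implementation, gradients) are still computed.
        y = self.quantizer(x)
        self.step += 1
        # During the delay period, return the original float value.
        if self.step <= self.delay_steps:
            return x
        return y
```

With `delay_steps=2`, the first two calls return the float input unchanged while the quantizer still runs underneath; from the third call on, the quantized value is returned.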

aditya-167 commented 2 months ago

Hi, I would like to work on this. I am new to Brevitas.

Giuseppe5 commented 1 month ago

Hello, thanks for offering! Although the general idea is relatively clear in my mind, there are some technical details about the implementation that I still need to figure out and discuss with @nickfraser. We'll keep this thread up to date when there is news.