Xilinx / finn

Dataflow compiler for QNN inference on FPGAs
https://xilinx.github.io/finn
BSD 3-Clause "New" or "Revised" License

Fix to threshold's width optimization #866

Closed mmrahorovic closed 1 year ago

mmrahorovic commented 1 year ago

The width of the thresholds is based on the width of the input datatype and of the threshold datatype. In most cases the input comes from an MVU, whose output datatype is 32-bit by default. Because MinimizeAccumulatorWidth is applied iteratively in a loop, the MVU's output datatype is already minimized before the Thresholding_Batch layer is considered. However, we need to run the InferDataTypes transformation to propagate this information, so that the succeeding Thresholding_Batch layer sees the minimized input bit-width and does not default to 32-bit thresholds.
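
To illustrate the ordering issue, here is a minimal toy sketch in plain Python. It does not use FINN's API: the `Node` class and the helpers `minimize_accumulator`, `infer_datatypes`, and `threshold_bits` are hypothetical stand-ins, the bit-widths are made-up numbers, and taking the threshold width as the maximum of the input and threshold datatype widths is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    in_bits: int   # bit-width the node currently assumes for its input
    out_bits: int  # bit-width of the node's output


def minimize_accumulator(node: Node, needed_bits: int) -> None:
    """Toy stand-in for accumulator minimization: shrink the output width."""
    node.out_bits = min(node.out_bits, needed_bits)


def infer_datatypes(chain: list) -> None:
    """Toy stand-in for datatype inference: copy each producer's
    output width to the input width of the node that consumes it."""
    for producer, consumer in zip(chain, chain[1:]):
        consumer.in_bits = producer.out_bits


def threshold_bits(node: Node, thresh_dtype_bits: int) -> int:
    """Assumed rule: threshold width is the larger of the input width
    and the threshold datatype width."""
    return max(node.in_bits, thresh_dtype_bits)


# MVU with the default 32-bit output, followed by a thresholding layer
# that still assumes a 32-bit input.
mvu = Node("mvu", in_bits=8, out_bits=32)
thr = Node("thresholding", in_bits=32, out_bits=4)
chain = [mvu, thr]

# The MVU is optimized first (hypothetical requirement: 14 bits).
minimize_accumulator(mvu, needed_bits=14)

# Without propagation, the thresholding layer still sees 32 bits.
stale = threshold_bits(thr, thresh_dtype_bits=10)

# After propagation, it sees the minimized 14-bit input.
infer_datatypes(chain)
fresh = threshold_bits(thr, thresh_dtype_bits=10)

print(stale, fresh)
```

In this toy model the thresholding layer only benefits from the MVU's smaller accumulator once the propagation step has run between the two optimizations, which is the behaviour this fix is after.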