The width of the thresholds is determined by the width of the input datatype and the threshold datatype. In most cases the input comes from an MVU, whose output datatype defaults to 32 bits. Because MinimizeAccumulatorWidth is applied iteratively in a loop, the output datatype of the MVU is minimized before the Thresholding_Batch layer is considered. However, we still need to run the InferDataTypes transformation to propagate this information, so that the succeeding Thresholding_Batch sees the minimized bit-width of its input and does not default to 32-bit thresholds.
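
The ordering issue can be illustrated with a small toy model. This is only a sketch and does not use the FINN API: `min_signed_bits`, `toy_flow`, the dictionary keys, and the example value ranges are all hypothetical, and it assumes for simplicity that the thresholding layer sizes its thresholds directly from the bit-width annotated on its input tensor.

```python
def min_signed_bits(lo: int, hi: int) -> int:
    """Smallest two's-complement width that can hold every integer in [lo, hi]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


def toy_flow(propagate: bool) -> int:
    """Return the input width the (toy) thresholding layer ends up seeing."""
    # Hypothetical graph state: the MVU node attribute and the annotation on
    # the tensor between MVU and thresholding both start at the 32-bit default.
    graph = {"mvu_attr_bits": 32, "tensor_bits": 32}

    # Step 1: minimize the accumulator (stand-in for MinimizeAccumulatorWidth).
    # Assumed example: 64-element dot product, weights in [-8, 7], inputs in [0, 15].
    lo = 64 * (-8) * 15
    hi = 64 * 7 * 15
    graph["mvu_attr_bits"] = min_signed_bits(lo, hi)

    # Step 2: propagate the node attribute to the tensor annotation
    # (stand-in for InferDataTypes). Skipping this leaves the annotation stale.
    if propagate:
        graph["tensor_bits"] = graph["mvu_attr_bits"]

    # Step 3: the thresholding layer reads the tensor annotation, not the MVU attribute.
    return graph["tensor_bits"]


if __name__ == "__main__":
    print("without propagation:", toy_flow(propagate=False))
    print("with propagation:   ", toy_flow(propagate=True))
```

In this toy setup the thresholding layer still sees the stale 32-bit annotation when the propagation step is skipped, and only picks up the minimized width once the annotation has been refreshed.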