I ran into an issue where NEGEMMLowpMatrixMultiplyCore produces very large INT32 outputs when the zero_point values of its input matrices are > 0. As a result, when NEGEMMLowpOutputStage takes those INT32 values and performs the fixed-point multiplication, the result overflows the QASYMM8 range (i.e. it is > 255).
The zero_point values of input matrices can be > 0 when we quantize negative FLOAT32 numbers to QASYMM8.
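To illustrate why negative inputs lead to a non-zero zero_point, here is a minimal sketch assuming the usual min/max affine quantization scheme. `QParams`, `choose_qparams` and `quantize` are hypothetical helpers written for this post, not functions from the library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type (not a library type): affine quantization parameters.
struct QParams
{
    double       scale;
    std::int32_t zero_point;
};

// Pick scale and zero_point so that [min, max] maps onto [0, 255].
QParams choose_qparams(double min, double max)
{
    // Widen the range to include 0 so that real 0.0 is exactly representable.
    min = std::min(min, 0.0);
    max = std::max(max, 0.0);

    const double scale = (max - min) / 255.0;
    if(scale == 0.0)
    {
        return { 1.0, 0 }; // degenerate range: every value is 0
    }

    // zero_point is the quantized value that represents real 0.0.
    const double zp = std::round(-min / scale);
    return { scale, static_cast<std::int32_t>(std::min(255.0, std::max(0.0, zp))) };
}

// Quantize one real value to the unsigned 8-bit range using the parameters above.
std::uint8_t quantize(double r, const QParams &qp)
{
    const double q = std::round(r / qp.scale) + qp.zero_point;
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, q)));
}
```

With a range such as [-1, 3] the zero_point comes out strictly positive, while a non-negative range such as [0, 2] gives a zero_point of 0.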
I think the root cause of the issue is how the sign of zero_point is interpreted. Here it states that the offsets are to be added. Based on a basic derivation of the arithmetic, I believe they should instead be subtracted.
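The derivation I have in mind can be sketched as follows, assuming the usual affine quantization scheme in which a real value $r$ is represented by a quantized value $q$ with scale $s$ and zero point $z$ (the symbols are my own notation, not the library's):

$$
r = s\,(q - z)
$$

For the product of two quantized matrices $a$ and $b$ this gives

$$
\sum_k r_a[i,k]\; r_b[k,j] \;=\; s_a\, s_b \sum_k \bigl(q_a[i,k] - z_a\bigr)\bigl(q_b[k,j] - z_b\bigr)
$$

so the INT32 accumulator matches a form with added offsets, $\sum_k (q_a[i,k] + o_a)(q_b[k,j] + o_b)$, only if $o_a = -z_a$ and $o_b = -z_b$.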
May I know why the offset values are being added, or is there something else that I missed? Here's the comment in code:
* -# Convert a values from QASYMM8 to int32 and add a_offset to each of them.
* -# Convert b values from QASYMM8 to int32 add b_offset to each of them.
* -# Compute the matrix product of the resulting a * b in int32.
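To make the effect of the sign concrete, here is a plain reference sketch of the three commented steps. It is not the library's kernel; `gemmlowp_reference` and its parameters are hypothetical names chosen for this post.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference (non-optimized) version of the three steps in the comment.
// a is M x K and b is K x N, both row-major; the result is M x N in int32.
std::vector<std::int32_t> gemmlowp_reference(const std::vector<std::uint8_t> &a,
                                             const std::vector<std::uint8_t> &b,
                                             std::size_t M, std::size_t K, std::size_t N,
                                             std::int32_t a_offset, std::int32_t b_offset)
{
    std::vector<std::int32_t> out(M * N, 0);
    for(std::size_t i = 0; i < M; ++i)
    {
        for(std::size_t j = 0; j < N; ++j)
        {
            std::int32_t acc = 0;
            for(std::size_t k = 0; k < K; ++k)
            {
                // Steps 1 and 2: widen to int32 and ADD the offsets, as the comment says.
                const std::int32_t av = static_cast<std::int32_t>(a[i * K + k]) + a_offset;
                const std::int32_t bv = static_cast<std::int32_t>(b[k * N + j]) + b_offset;
                // Step 3: accumulate the product in int32.
                acc += av * bv;
            }
            out[i * N + j] = acc;
        }
    }
    return out;
}
```

For a 1x2 matrix {130, 126} times a 2x1 matrix {138, 118}, both with a zero_point of 128, passing +128 as the offsets gives 131112, while passing -128 gives 40, which is the value expected from the subtraction form. In other words, the "add" wording only matches the arithmetic if the offsets handed to the function are the negated zero points.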
To see the issue, simply use this example and change the following random generator from:
to:
Run the test:
Note: If I change the "addition" to "subtraction" in the equation, then I can get the correct result in the above example.
Thanks!