Closed mengts closed 4 years ago
@mengts You're right. Thanks for reporting it.
After looking at it again, I won't fix it. It is technically correct that it may saturate and the sum may use a __SSAT to avoid it.
But, the accumulator has only one guard bit which means it will saturate very quickly and the results (saturated or not) will be wrong from signal processing point of view.
This function is requiring the input to be scaled to avoid any saturation and give a correct result.
The focus of this function is speed by use of a smaller accumulator and no SSAT instructions.
The other function arm_conv_q15 is using a bigger accumulator and a SSAT instruction and is not requiring pre-scaling for this reason.
Ok, I got it. Thanks for your reply.
Hi, In arm_conv_fast_opt_q15(), it uses uses a 32-bit accumulator with 2.30 format to store intermediate value. Then, the result should be converted to 1.15 with right shift 15 bits and saturate to 1.15 as the code below. https://github.com/ARM-software/CMSIS_5/blob/8d31ab2d5d32a2234e82124d3111ac1a68be1dd1/CMSIS/DSP/Source/FilteringFunctions/arm_conv_fast_opt_q15.c#L354
However, in conv_fast_q15() as below, the result(2.30 format) directly converts to 1.15 by discard the highest bit after right shift 15 bits.
I think it might need to use same saturate method as arm_conv_fast_opt_q15() to avoid error in positive and negative. And the same situation is also in arm_correlate_fast_q15() and arm_conv_partial_fast_q15(). Thanks. https://github.com/ARM-software/CMSIS_5/blob/8d31ab2d5d32a2234e82124d3111ac1a68be1dd1/CMSIS/DSP/Source/FilteringFunctions/arm_conv_fast_q15.c#L229