fdtc: Fix overflow on NEON

webmproject / sjpeg

SimpleJPEG: simple jpeg encoder

Apache License 2.0


fdtc: Fix overflow on NEON #129

Closed same-denik closed 4 months ago

same-denik commented 4 months ago

Large coefficients can cause an int16_t overflow in the FDCT calculation on NEON. The fix replaces the last unsafe butterfly operation with vector multiply-add/subtract instructions that produce int32_t outputs.
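To illustrate the failure mode, here is a minimal scalar sketch in portable C++. It is not the sjpeg NEON code: the helper names `WrapAdd16`, `WrapSub16`, and `WideButterfly` are hypothetical, and the sketch only assumes that a 16-bit lane add or subtract wraps modulo 2^16 while a widened operation keeps the full result in 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: converts a 16-bit pattern to its signed value
// without relying on implementation-defined narrowing conversions.
static int16_t ToSigned16(uint16_t v) {
  return v >= 0x8000u
             ? static_cast<int16_t>(static_cast<int32_t>(v) - 0x10000)
             : static_cast<int16_t>(v);
}

// Hypothetical helper: models a non-saturating 16-bit lane add,
// which wraps modulo 2^16 when the true sum does not fit in int16_t.
static int16_t WrapAdd16(int16_t a, int16_t b) {
  const uint32_t sum = static_cast<uint32_t>(static_cast<uint16_t>(a)) +
                       static_cast<uint32_t>(static_cast<uint16_t>(b));
  return ToSigned16(static_cast<uint16_t>(sum & 0xFFFFu));
}

// Hypothetical helper: same wrapping model for subtraction.
static int16_t WrapSub16(int16_t a, int16_t b) {
  const uint32_t diff = static_cast<uint32_t>(static_cast<uint16_t>(a)) -
                        static_cast<uint32_t>(static_cast<uint16_t>(b));
  return ToSigned16(static_cast<uint16_t>(diff & 0xFFFFu));
}

// Result of a butterfly (a + b, a - b) kept in 32-bit outputs.
struct Wide {
  int32_t sum;
  int32_t diff;
};

// Widened butterfly: operands are promoted to int32_t before the
// add/subtract, so large coefficients cannot overflow.
static Wide WideButterfly(int16_t a, int16_t b) {
  return {static_cast<int32_t>(a) + static_cast<int32_t>(b),
          static_cast<int32_t>(a) - static_cast<int32_t>(b)};
}
```

With `a = 20000` and `b = 20000`, `WrapAdd16(a, b)` yields -25536 because the true sum 40000 exceeds the int16_t maximum of 32767, whereas `WideButterfly(a, b).sum` is 40000.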

Fixes #128

jzern commented 4 months ago

Thanks for the patch.