gnuradio / volk

The Vector Optimized Library of Kernels
http://libvolk.org
GNU Lesser General Public License v3.0
557 stars 202 forks source link

Avoid integer overflow in volk_8ic_x2_multiply_conjugate_16ic corner case #701

Closed argilo closed 12 months ago

argilo commented 1 year ago

While looking for flaky tests with ctest --repeat until-fail:<n>, the qa_volk_8ic_x2_multiply_conjugate_16ic test turned out to be flaky. Failures are rare, but can be induced with the following change to the test framework:

diff --git a/lib/qa_utils.cc b/lib/qa_utils.cc
index 4be7b8a..21a1e34 100644
--- a/lib/qa_utils.cc
+++ b/lib/qa_utils.cc
@@ -99,7 +99,7 @@ void load_random_data(void* data, volk_type_t type, unsigned int n)
             if (type.is_signed) {
                 std::uniform_int_distribution<int16_t> uniform_dist(
                     std::numeric_limits<int8_t>::min(),
-                    std::numeric_limits<int8_t>::max());
+                    -127);
                 for (unsigned int i = 0; i < n; i++)
                     ((int8_t*)data)[i] = uniform_dist(rnd_engine);
             } else {

The root cause is that the corner case aVector[i] = -128-128j, bVector[i] = -128-128j produces an output of 32768+0j, and the real part is too large to fit in a signed 16-bit integer. The SIMD protokernels use saturation arithmetic (producing 32767+0j, while the generic protokernel (and tail processing in the other protokernels) overflows and produces -32768+0j. To fix the problem, I've switched to saturation arithmetic everywhere, which seems more appropriate for a DSP context, and avoids Undefined Behaviour.

Only the real part of the result needs to be clamped, because the imaginary part cannot overflow.

argilo commented 1 year ago

The test failure is due to #663.