While looking for flaky tests with ctest --repeat until-fail:<n>, the qa_volk_8ic_x2_multiply_conjugate_16ic test turned out to be flaky. Failures are rare, but can be induced with the following change to the test framework:
diff --git a/lib/qa_utils.cc b/lib/qa_utils.cc
index 4be7b8a..21a1e34 100644
--- a/lib/qa_utils.cc
+++ b/lib/qa_utils.cc
@@ -99,7 +99,7 @@ void load_random_data(void* data, volk_type_t type, unsigned int n)
if (type.is_signed) {
std::uniform_int_distribution<int16_t> uniform_dist(
std::numeric_limits<int8_t>::min(),
- std::numeric_limits<int8_t>::max());
+ -127);
for (unsigned int i = 0; i < n; i++)
((int8_t*)data)[i] = uniform_dist(rnd_engine);
} else {
The root cause is that the corner case aVector[i] = -128-128j, bVector[i] = -128-128j produces an output of 32768+0j, and the real part is too large to fit in a signed 16-bit integer. The SIMD protokernels use saturation arithmetic (producing 32767+0j, while the generic protokernel (and tail processing in the other protokernels) overflows and produces -32768+0j. To fix the problem, I've switched to saturation arithmetic everywhere, which seems more appropriate for a DSP context, and avoids Undefined Behaviour.
Only the real part of the result needs to be clamped, because the imaginary part cannot overflow.
While looking for flaky tests with
ctest --repeat until-fail:<n>
, theqa_volk_8ic_x2_multiply_conjugate_16ic
test turned out to be flaky. Failures are rare, but can be induced with the following change to the test framework:The root cause is that the corner case
aVector[i] = -128-128j
,bVector[i] = -128-128j
produces an output of32768+0j
, and the real part is too large to fit in a signed 16-bit integer. The SIMD protokernels use saturation arithmetic (producing32767+0j
, while the generic protokernel (and tail processing in the other protokernels) overflows and produces-32768+0j
. To fix the problem, I've switched to saturation arithmetic everywhere, which seems more appropriate for a DSP context, and avoids Undefined Behaviour.Only the real part of the result needs to be clamped, because the imaginary part cannot overflow.