gnuradio / volk

The Vector Optimized Library of Kernels
http://libvolk.org
GNU Lesser General Public License v3.0
537 stars 202 forks source link

volk_32f_invsqrt_32f is untested, may be buggy #687

Open argilo opened 10 months ago

argilo commented 10 months ago

Currently there is no test for volk_32f_invsqrt_32f, and it doesn't look like there's been one in the past.

I tried adding the following line to kernel_tests.h:

QA(VOLK_INIT_TEST(volk_32f_invsqrt_32f, test_params_inacc))

But when running the test in a loop (ctest -R invsqrt --output-on-failure --repeat until-fail:100000), it soon fails:

Test project /home/argilo/git/volk/build
    Start 40: qa_volk_32f_invsqrt_32f
1/1 Test #40: qa_volk_32f_invsqrt_32f ..........***Failed    0.06 sec
RUN_VOLK_TESTS: volk_32f_invsqrt_32f(131071,1)
/home/argilo/git/volk/kernels/volk/volk_32f_invsqrt_32f.h:71:22: runtime error: signed integer overflow: 1597463007 - -550196562 cannot be represented in type 'int'
a_avx completed in 0.43251 ms
/home/argilo/git/volk/kernels/volk/volk_32f_invsqrt_32f.h:71:22: runtime error: signed integer overflow: 1597463007 - -550057730 cannot be represented in type 'int'
a_sse completed in 0.284684 ms
/home/argilo/git/volk/kernels/volk/volk_32f_invsqrt_32f.h:71:22: runtime error: signed integer overflow: 1597463007 - -552585760 cannot be represented in type 'int'
generic completed in 3.31004 ms
/home/argilo/git/volk/kernels/volk/volk_32f_invsqrt_32f.h:71:22: runtime error: signed integer overflow: 1597463007 - -550196562 cannot be represented in type 'int'
u_avx completed in 0.356173 ms
offset 72267 in1: 1.98178e+19 in2: inf tolerance was: 0.01
volk_32f_invsqrt_32f: fail on arch a_avx
offset 72267 in1: 1.98178e+19 in2: inf tolerance was: 0.01
volk_32f_invsqrt_32f: fail on arch a_sse
offset 72267 in1: 1.98178e+19 in2: inf tolerance was: 0.01
volk_32f_invsqrt_32f: fail on arch u_avx
argilo commented 10 months ago

Related: #686

Ka-zam commented 10 months ago

I looked at this as well and I wrote a new kernel based on the rsqrt() intrinsic. It's faster and more accurate than this kernel.

Should I submit a PR for this?

jdemel commented 10 months ago

The invsqrt kernel needs to stay until we do another major release because we follow semver and can't just remove a kernel. We can remove a specific implementation of a kernel though.

@Ka-zam your rsqrt implementation would probably fit best in the invsqrt kernel.