p12tic / libsimdpp

Portable header-only C++ low level SIMD library
Boost Software License 1.0

sse2 implementation of i_to_float32(const float64<4>& a) broken (again) #87

Closed peabody-korg closed 7 years ago

peabody-korg commented 7 years ago

The arguments to

return _mm_movelh_ps(r2.native(), r1.native());

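are backwards. _mm_movelh_ps(a, b) keeps the lower 2 lanes of its first argument and moves the lower 2 lanes of its second argument into the upper 2 lanes of the result, i.e. it returns [a0, a1, b0, b1]. With r2 passed first, the two halves of the converted vector come out swapped.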

Should be:

return _mm_movelh_ps(r1.native(), r2.native());
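
To make the lane order concrete, here is a minimal standalone sketch using plain SSE2 intrinsics. It is not the libsimdpp code: the helper names are hypothetical, and it assumes that r1 and r2 are the results of converting the low and high halves of the float64<4> vector with _mm_cvtpd_ps.

```cpp
#include <array>
#include <cassert>
#include <emmintrin.h>  // SSE2: _mm_cvtpd_ps, _mm_loadu_pd (pulls in the SSE header too)

// Hypothetical helper, not part of libsimdpp: narrow four doubles, held as
// two __m128d halves, to one __m128 of floats while preserving lane order.
inline __m128 narrow_f64x4_to_f32x4(__m128d lo, __m128d hi)
{
    __m128 r1 = _mm_cvtpd_ps(lo);  // [lo0, lo1, 0, 0]
    __m128 r2 = _mm_cvtpd_ps(hi);  // [hi0, hi1, 0, 0]
    // _mm_movelh_ps(a, b) yields [a0, a1, b0, b1], so the low half goes first.
    return _mm_movelh_ps(r1, r2);
}

// Array-based wrapper so the result can be inspected lane by lane.
inline std::array<float, 4> narrow_to_array(const std::array<double, 4>& in)
{
    __m128d lo = _mm_loadu_pd(in.data());      // in[0], in[1]
    __m128d hi = _mm_loadu_pd(in.data() + 2);  // in[2], in[3]
    std::array<float, 4> out{};
    _mm_storeu_ps(out.data(), narrow_f64x4_to_f32x4(lo, hi));
    return out;
}

// Regression-style check: distinct values per lane expose any swapped halves.
inline void check_lane_order()
{
    const std::array<double, 4> in = {1.0, 2.0, 3.0, 4.0};
    const std::array<float, 4> out = narrow_to_array(in);
    assert(out[0] == 1.0f && out[1] == 2.0f);
    assert(out[2] == 3.0f && out[3] == 4.0f);
}
```

With the arguments swapped to _mm_movelh_ps(r2, r1), the same input would produce 3, 4, 1, 2. A test that uses identical values in every lane would not catch that, which is why the check above uses a distinct value per lane.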

p12tic commented 7 years ago

Thanks for spotting this. I've merged your PR and added a test to guard against regressions like this in 82bc8590df53ae6670610d1110b2a51c7d827228.