xtensor-stack / xsimd

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
https://xsimd.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Auto detection on MSVC 64bit issue with older CPUs #862

Open jr0me opened 1 year ago

jr0me commented 1 year ago

Recently, when upgrading from xsimd 7 to xsimd 8, I started seeing crashes caused by illegal instruction exceptions on older x64 CPUs that lack SSE4.x instructions, when compiling with MSVC. Unfortunately, this compiler offers no way of disabling this instruction set. I was wondering whether the auto detection performed by xsimd takes this scenario into account, and whether this is the expected behaviour, perhaps because older CPUs are unsupported.

serge-sans-paille commented 1 year ago

Thanks for reporting. Any hint on the actual instruction that generates the crash? That would greatly help tracking the origin of the issue.

jr0me commented 1 year ago

Yes, sure. It is blendvps. Called from: xsimd-8.0.5/include/xsimd/arch/xsimd_sse4_1.hpp:194

template <class A>
inline batch<float, A> xsimd::kernel::select(batch_bool<float, A> const& cond, batch<float, A> const& true_br, batch<float, A> const& false_br, requires_arch<sse4_1>) noexcept
{
    return _mm_blendv_ps(false_br, true_br, cond);
}

amyspark commented 1 year ago

> I was wondering if the auto detection performed by xsimd takes into account this scenario and if it is the expected behaviour, perhaps due to unsupported older cpus.

This is a very unfortunate quirk of MSVC: it does run-time instruction set checking, but only for code it has assembled itself. You need to make sure you construct the batches only after detecting the underlying architecture, either with xsimd::available_architectures().best or with the forwarder described in the documentation.

As an example, for Krita I ported Vc's detection mechanism, and override default_arch with the result of the test. See https://invent.kde.org/graphics/krita/-/blob/master/libs/multiarch/KoMultiArchBuildSupport.h#L23 and especially https://invent.kde.org/graphics/krita/-/blob/master/libs/multiarch/xsimd_extensions/config/xsimd_arch.hpp#L44.
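
For readers who want to see the shape of such a run-time check, here is a minimal sketch that does not use xsimd at all. It assumes an x86/x86-64 target and queries CPUID directly (leaf 1, ECX bit 19 reports SSE4.1), then picks between the blendvps path quoted above and an SSE2-only fallback. The names cpu_has_sse4_1, select_sse2, select_sse4_1, select_f32 and lane are hypothetical helpers invented for this illustration; they are not part of xsimd, and this is not how xsimd or Krita implement their detection.

```cpp
#include <cassert>
#include <immintrin.h>
#if defined(_MSC_VER)
#include <intrin.h> // __cpuid
#else
#include <cpuid.h> // __get_cpuid
#endif

// Hypothetical helper: true when CPUID leaf 1 reports SSE4.1 (ECX bit 19).
inline bool cpu_has_sse4_1() noexcept
{
#if defined(_MSC_VER)
    int regs[4] = { 0, 0, 0, 0 };
    __cpuid(regs, 1);
    return (regs[2] & (1 << 19)) != 0; // regs[2] holds ECX
#else
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false; // leaf 1 not available
    return (ecx & (1u << 19)) != 0;
#endif
}

// SSE2-only select: (cond & true_br) | (~cond & false_br).
// Assumes each lane of cond is all ones or all zeros, as comparisons produce.
inline __m128 select_sse2(__m128 cond, __m128 true_br, __m128 false_br) noexcept
{
    return _mm_or_ps(_mm_and_ps(cond, true_br), _mm_andnot_ps(cond, false_br));
}

// SSE4.1 select using blendvps. GCC and Clang need the target attribute to
// emit SSE4.1 here; MSVC accepts the intrinsic without any flag, which is
// why nothing stops it from reaching a CPU that cannot execute it.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
inline __m128 select_sse4_1(__m128 cond, __m128 true_br, __m128 false_br) noexcept
{
    return _mm_blendv_ps(false_br, true_br, cond);
}

// Run-time dispatch: the CPU is queried once, then the result is reused.
inline __m128 select_f32(__m128 cond, __m128 true_br, __m128 false_br) noexcept
{
    static const bool has_sse4_1 = cpu_has_sse4_1();
    return has_sse4_1 ? select_sse4_1(cond, true_br, false_br)
                      : select_sse2(cond, true_br, false_br);
}

// Small helper to read one lane back, used only for checking results.
inline float lane(__m128 v, int i) noexcept
{
    assert(i >= 0 && i < 4);
    alignas(16) float tmp[4];
    _mm_store_ps(tmp, v);
    return tmp[i];
}
```

With a mask produced by a comparison such as _mm_cmplt_ps, select_f32 returns the same lanes on both paths, so callers never touch blendvps unless the CPU reports SSE4.1. The point of the sketch is only the ordering: detect first, then choose the code path.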