Open jr0me opened 1 year ago
Thanks for reporting. Any hint on the actual instruction that generates the crash? That would greatly help tracking the origin of the issue.
Yes, sure. It is blendvps. Called from: xsimd-8.0.5/include/xsimd/arch/xsimd_sse4_1.hpp:194
template <class A>
inline batch<float, A> xsimd::kernel::select(batch_bool<float, A> const& cond, batch<float, A> const& true_br, batch<float, A> const& false_br, requires_arch<sse4_1>) noexcept
{
    return _mm_blendv_ps(false_br, true_br, cond);
}
This is a very unfortunate quirk of MSVC; it does run-time instruction set checking, but only on code it has assembled itself. You need to ensure you're constructing the batches after detecting the underlying architecture with xsimd::available_architectures().best or with the forwarder described in the documentation.
As an example, for Krita I ported Vc's detection mechanism and override default_arch with the result of the test. See https://invent.kde.org/graphics/krita/-/blob/master/libs/multiarch/KoMultiArchBuildSupport.h#L23 and especially https://invent.kde.org/graphics/krita/-/blob/master/libs/multiarch/xsimd_extensions/config/xsimd_arch.hpp#L44.
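To illustrate the general idea of "detect first, then pick the kernel", here is a minimal sketch that does not use xsimd's own API or Krita's code. All names in it (cpu_has_sse4_1, select4_scalar, select4_sse41, pick_select4, SKETCH_X86) are hypothetical and exist only for this example. It assumes an x86 target for the SSE4.1 path and falls back to scalar code everywhere else. The SSE4.1 kernel is only ever reached after CPUID leaf 1 reports the feature (ECX bit 19), so blendvps is never executed on a CPU that lacks it.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#include <smmintrin.h>
#define SKETCH_X86 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#include <smmintrin.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

// Hypothetical helper: true when CPUID leaf 1 reports SSE4.1 (ECX bit 19).
inline bool cpu_has_sse4_1()
{
#if SKETCH_X86 && defined(_MSC_VER)
    int regs[4] = { 0, 0, 0, 0 };
    __cpuid(regs, 1);
    return (regs[2] & (1 << 19)) != 0;
#elif SKETCH_X86
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & (1u << 19)) != 0;
#else
    return false;
#endif
}

// Scalar fallback with blendvps semantics: the sign bit of each mask lane
// selects a[i] (bit set) or b[i] (bit clear).
inline void select4_scalar(const std::uint32_t* mask, const float* a,
                           const float* b, float* out)
{
    for (int i = 0; i < 4; ++i)
        out[i] = (mask[i] & 0x80000000u) ? a[i] : b[i];
}

#if SKETCH_X86
// SSE4.1 kernel. On GCC/Clang the target attribute enables SSE4.1 codegen
// for this function only; MSVC accepts the intrinsic without any flag.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
inline void select4_sse41(const std::uint32_t* mask, const float* a,
                          const float* b, float* out)
{
    const __m128 m = _mm_castsi128_ps(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(mask)));
    _mm_storeu_ps(out, _mm_blendv_ps(_mm_loadu_ps(b), _mm_loadu_ps(a), m));
}
#endif

using select4_fn = void (*)(const std::uint32_t*, const float*,
                            const float*, float*);

// Decide once, at run time, which kernel is safe to call on this CPU.
inline select4_fn pick_select4()
{
#if SKETCH_X86
    if (cpu_has_sse4_1())
        return &select4_sse41;
#endif
    return &select4_scalar;
}
```

A caller would store the result of pick_select4() once at startup and route every select through that pointer. With xsimd the same role is played by checking the detected architecture before constructing any batch, as described above.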
Recently, when upgrading from xsimd7 to xsimd8, I started seeing some crashes due to illegal instruction exceptions on older x64 CPUs with no SSE4.x instructions when compiling with MSVC. Unfortunately there's no way of disabling this instruction set in this compiler. I was wondering if the auto detection performed by xsimd takes this scenario into account and if it is the expected behaviour, perhaps due to unsupported older CPUs.