Closed oscarfv closed 4 years ago
It was just a performance issue, some iteration intensive benchmarks are unfortunately quite a bit slower without this :(
The second part is a leftover from refactoring, thanks for finding. I've now removed the if
I could add a define that allows to disable the intrinsics and use a fallback, that shouldn't be too difficult. Would this work for you?
I could add a define that allows to disable the intrinsics and use a fallback, that shouldn't be too difficult. Would this work for you?
I'm not sure about the correct solution for the problem I'm seeing on MinGW-w64, but an option for disabling intrinsics is useful in general. There are ports of gcc with limited intrinsics support.
For the record:
#include <x86intrin.h>
unsigned f() {
return _tzcnt_u32(42);
}
$ g++ rh.cpp
In file included from c:/apps/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/10.2.0/include/immintrin.h:107,
from c:/apps/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/10.2.0/include/x86intrin.h:32,
from rh.cpp:9:
c:/apps/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/10.2.0/include/bmiintrin.h: In function 'unsigned int f()':
c:/apps/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/10.2.0/include/bmiintrin.h:104:1: error: inlining failed in call to 'always_inline' 'unsigned int _tzcnt_u32(unsigned int)': target specific option mismatch
104 | _tzcnt_u32 (unsigned int __X)
| ^~~~~~~~~~
rh.cpp:12:20: note: called from here
12 | return _tzcnt_u32(42);
| ~~~~~~~~~~^~~~
$ g++ -v
Using built-in specs.
COLLECT_GCC=c:\apps\msys64\mingw64\bin\g++.exe
COLLECT_LTO_WRAPPER=c:/apps/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.2.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-10.2.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include --libexecdir=/mingw64/lib --enable-bootstrap --with-arch=x86-64 --with-tune=generic --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++ --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes --disable-libstdcxx-pch --disable-libstdcxx-debug --disable-isl-version-check --enable-lto --enable-libgomp --disable-multilib --enable-checking=release --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --disable-plugin --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev1, Built by MSYS2 project' --with-bugurl=https://github.com/msys2/MINGW-packages/issues --with-gnu-as --with-gnu-ld
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.0 (Rev1, Built by MSYS2 project)
The error does not happen if I use -march=native
on my Intel Skylake box, but other options like -mavx2
does not prevent it. Of course, using -march
restricts the machines that can run your executable. That's something to consider also: I'm not sure which machines support which intrinsics even on Intel.
Clang/Mingw-w64 compiles this code right away.
You can now define ROBIN_HOOD_DISABLE_INTRINSICS
before including robin_hood.h to disable intrinsics (currently only on master, not yet released)
Thank you.
What was the reason for reverting https://github.com/martinus/robin-hood-hashing/commit/f45a309af7d08cb98fc08dc7903ac9a7091db6d5 ? Any reliability concerns or just performance?
I'm asking because, on certain contexts, including
x86intrin.h
on Mingw-w64 triggers a bug on g++.