Closed Sniper296 closed 2 years ago
Tried with WSL2: cross-compiling for Windows builds fine, as it does under Msys2, but the binary still segfaults with those two implementations. Compiling natively within WSL2 builds and runs fine (and faster than the implementations that don't segfault).
Still, native Windows would be preferable to avoid the slight CPU overhead and moderate RAM overhead of WSL2.
I think that on Windows, amd64-51-30k and amd64-64-24k never worked correctly, and donna makes more sense. It's probably something to do with calling-convention (ABI) differences, or something along those lines. That's why I've always built Windows stuff with donna.
I think I've just fixed it in https://github.com/cathugger/mkp224o/commit/4e20f086e3d0f8aa881afaa891eb8759b87fb174
Yep. Now it runs the same speed as under WSL (~2x donna) without the ~50x RAM usage!
For any Windows users, until #38 is implemented, on an i7-2600k the best performance I get comes from:
-march=native -O3 -fno-plt -flto -fomit-frame-pointer
./configure --enable-regex --with-pcre2="[mingw64|clang64]/bin/pcre2-config" --enable-amd64-64-24k --enable-intfilter --enable-binsearch --enable-besort
I see no appreciable performance difference between GCC and Clang.
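The recipe above can be wrapped in a small script so the right pcre2-config path is picked for the active Msys2 environment. This is only a sketch: it assumes the flags listed above are meant to go in CFLAGS, that `$MSYSTEM` is set to `MINGW64` or `CLANG64` as in a stock Msys2 shell, and the helper names `pcre2_config_for` and `configure_cmd` are made up for illustration. It prints the configure command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of mkp224o.

# Map an Msys2 environment name to its pcre2-config path.
pcre2_config_for() {
  case "$1" in
    MINGW64) echo "/mingw64/bin/pcre2-config" ;;
    CLANG64) echo "/clang64/bin/pcre2-config" ;;
    *) return 1 ;;
  esac
}

# Print the configure invocation from the comment above for a given environment.
configure_cmd() {
  local pcre2
  pcre2="$(pcre2_config_for "$1")" || {
    echo "unsupported MSYSTEM: $1" >&2
    return 1
  }
  printf './configure --enable-regex --with-pcre2="%s" --enable-amd64-64-24k --enable-intfilter --enable-binsearch --enable-besort\n' "$pcre2"
}

# Assumption: the optimisation flags are passed via CFLAGS.
export CFLAGS="-march=native -O3 -fno-plt -flto -fomit-frame-pointer"

# Print (rather than execute) the command for the current environment.
configure_cmd "${MSYSTEM:-MINGW64}"
```

Once the printed command looks right, run it after `./autogen.sh` and follow it with `make`.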
On both d5b90d43a9d0783c4b4c1ff2c3f0c7b359fd663f and 7f714ee4f7cbdada158314a92ebda1f0433f048e, I get a segfault with either the amd64-51-30k or the amd64-64-24k implementations. ref10, donna and donna-sse2 all work fine.
Environment:
Msys2 (either clang64 or gcc64)
MINGW64_NT-10.0-19044
gcc version 11.3.0 (Rev1, Built by MSYS2 project)
clang version 14.0.3
libsodium 1.0.18-2
CC="clang"
CFLAGS="-march=native -Og -g -pipe -fomit-frame-pointer"
MAKEFLAGS="-j$(nproc)"
Build steps:
make clean
./autogen.sh
./configure --enable-regex --with-pcre2="/clang64/bin/pcre2-config" --enable-[amd64-51-30k|amd64-64-24k] --enable-intfilter [--enable-binsearch --enable-besort]
make
Result:
gdb --args ./mkp224o.exe -B -s -T test
I have yet to try with WSL2.
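For anyone reproducing the segfault, gdb can be run non-interactively so the backtrace is easy to paste into a report. A minimal sketch, assuming gdb is on the PATH and the binary was built with `-g` as in the CFLAGS above; the array name `gdb_bt_cmd` is just illustrative. If the binary or gdb is missing, the script only prints the command.

```shell
#!/usr/bin/env bash
# Same program arguments as the interactive gdb run above, but in batch mode:
# -batch exits when done, each -ex runs one gdb command ("run", then "bt").
gdb_bt_cmd=(gdb -batch -ex run -ex bt --args ./mkp224o.exe -B -s -T test)

if [ -x ./mkp224o.exe ] && command -v gdb >/dev/null 2>&1; then
  # Run the program under gdb and print a backtrace after it stops.
  "${gdb_bt_cmd[@]}"
else
  # Nothing to debug here; just show the command that would be run.
  printf '%s ' "${gdb_bt_cmd[@]}"
  echo
fi
```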