fairy-stockfish / Fairy-Stockfish

chess variant engine supporting Xiangqi, Shogi, Janggi, Makruk, S-Chess, Crazyhouse, Bughouse, and many more
https://fairy-stockfish.github.io/
GNU General Public License v3.0
607 stars 189 forks source link

Crashes on fishtest #29

Closed ianfab closed 5 years ago

ianfab commented 5 years ago

Recent tests on fishtest show occasional crashes on all workers for various variants. I was not able to reproduce it locally so far, so if anyone managed to find a reproducible bug or get a stacktrace, please let me know.

ianfab commented 5 years ago

While searching for a possible cause I fixed a bug in https://github.com/ianfab/Fairy-Stockfish/commit/05d1624cd740796c91ca4fb7822c8efd0c04113a, but that should be unrelated to this issue. The issue itself is still difficult to reproduce.

ianfab commented 5 years ago

Testing suggests that these crashes could be stack overflows caused by the increased stack size from the extension of the move list in https://github.com/ianfab/Fairy-Stockfish/commit/96332612da08fef1a7cb50af3b0cefdb6006f93e, see this failing build in appveyor.

I will push a commit to increase stack size for windows and macOSX. @ppigazzini This should hopefully also fix the crashes on your windows worker.

ppigazzini commented 5 years ago

@ianfab Let me know if you need some tests on Windows.

ianfab commented 5 years ago

@ppigazzini Thanks, since testing with appveyor clearly showed that stack size was the issue, I consider this to be resolved. I will let you know when all tests of the problematic versions have finished on fishtest, since after that windows workers should run stably again. Then it would be nice if you could add back the worker for a while just to confirm that this is the case.

ppigazzini commented 5 years ago

@ianfab the windows worker is collecting some crashes: http://35.161.250.236:6543/tests/view/5d565f346e23db34f4206d9d http://35.161.250.236:6543/tests/view/5d546ebf6e23db34f4206d97 http://35.161.250.236:6543/tests/view/5d595a136e23db247f6b64ff

I'm using g++ built by MinGW-W64 project to avoid this msys2 MinGW-W64 problem

g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=C:/msys64/mingw64-810/bin/../libexec/gcc/x86_64-w64-mingw32/8.1.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../../../src/gcc-8.1.0/configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --prefix=/mingw64 --with-sysroot=/c/mingw810/x86_64-810-posix-seh-rt_v6-rev0/mingw64 --enable-shared --enable-static --disable-multilib --enable-languages=c,c++,fortran,lto --enable-libstdcxx-time=yes --enable-threads=posix --enable-libgomp --enable-libatomic --enable-lto --enable-graphite --enable-checking=release --enable-fully-dynamic-string --enable-version-specific-runtime-libs --disable-libstdcxx-pch --disable-libstdcxx-debug --enable-bootstrap --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-gnu-as --with-gnu-ld --with-arch=nocona --with-tune=core2 --with-libiconv --with-system-zlib --with-gmp=/c/mingw810/prerequisites/x86_64-w64-mingw32-static --with-mpfr=/c/mingw810/prerequisites/x86_64-w64-mingw32-static --with-mpc=/c/mingw810/prerequisites/x86_64-w64-mingw32-static --with-isl=/c/mingw810/prerequisites/x86_64-w64-mingw32-static --with-pkgversion='x86_64-posix-seh-rev0, Built by MinGW-W64 project' --with-bugurl=https://sourceforge.net/projects/mingw-w64 CFLAGS='-O2 -pipe -fno-ident -I/c/mingw810/x86_64-810-posix-seh-rt_v6-rev0/mingw64/opt/include -I/c/mingw810/prerequisites/x86_64-zlib-static/include -I/c/mingw810/prerequisites/x86_64-w64-mingw32-static/include' CXXFLAGS='-O2 -pipe -fno-ident -I/c/mingw810/x86_64-810-posix-seh-rt_v6-rev0/mingw64/opt/include -I/c/mingw810/prerequisites/x86_64-zlib-static/include -I/c/mingw810/prerequisites/x86_64-w64-mingw32-static/include' CPPFLAGS=' -I/c/mingw810/x86_64-810-posix-seh-rt_v6-rev0/mingw64/opt/include -I/c/mingw810/prerequisites/x86_64-zlib-static/include -I/c/mingw810/prerequisites/x86_64-w64-mingw32-static/include' LDFLAGS='-pipe -fno-ident -L/c/mingw810/x86_64-810-posix-seh-rt_v6-rev0/mingw64/opt/lib -L/c/mingw810/prerequisites/x86_64-zlib-static/lib -L/c/mingw810/prerequisites/x86_64-w64-mingw32-static/lib '
Thread model: posix
gcc version 8.1.0 (x86_64-posix-seh-rev0, Built by MinGW-W64 project)
ianfab commented 5 years ago

@ppigazzini Sorry if my comment was misleading, it is expected that the windows workers still crash with the currently running tests since they are using versions before the fix, so only after those tests are finished and new tests with versions after the fix are started, no crashes are expected any more. I will submit new tests with the fixed versions in the evening (CET).

However, now that you mentioned mingw/gcc on windows I realized that this case presumably is not covered yet, since my changes only increase stack size for compilation with MSVC as well as on macOS. Maybe you could try to modify the custom_make.txt of your worker and explicitly pass the stack size, e.g., using -Wl,--stack,8388608 as an argument to g++. I should also cover this scenario in the makefile.

ianfab commented 5 years ago

@ppigazzini How does your make command look like? I noticed that it is actually not easy to inject the additional argument via the make command without overriding other flags.

https://github.com/ianfab/Fairy-Stockfish/commit/2e145233a223336e7a0a2e276793f2183db63881 should address the issue. It would be nice if you could run two comparable matches with and without the changed makefile to check whether it fixes the crashes. From the tests on fishtest it seems that the variant does not matter much, but for hoppelpoppel and asean the crashes were most frequent.

ppigazzini commented 5 years ago

@ianfab I'm overwriting the CXXFLAGS:

make profile-build ARCH=x86-64-modern COMP=mingw CXXFLAGS='-Wall -Wcast-qual -fno-exceptions -std=c++11 -fprofile-generate -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -msse3 -mpopcnt -DUSE_POPCNT -Wl,--stack,8388608' -j

I will test https://github.com/ianfab/Fairy-Stockfish/commit/2e145233a223336e7a0a2e276793f2183db63881 in the evening, some (real) homekeeping to do now :)

ppigazzini commented 5 years ago

@ianfab

ianfab commented 5 years ago

@ppigazzini Thanks, the results sound quite conclusive, so I will merge it.

I just wonder whether there is a better way to maintain the currently required stack size than having it in three different places of the code for macOS, MSVC, and mingw. Maybe making it a default value in the makefile would be an option, but since I anyway do not expect any further change any time soon, it does not really matter much.

ianfab commented 5 years ago

After the merge of the fix from official stockfish, I reverted my workaround, since it should not be required any more.