Closed vondele closed 4 years ago
some tasks needed for integration:
debug=yes
build init_nnue()
(startup, EvalFile, Use NNUE, is_ready, cmdline, ucinewgame, ....), add assert.custom_make.txt
Let's use this issue specifically for synchronizing work on these or related tasks. General discussion use https://github.com/official-stockfish/Stockfish/issues/2728
debug=yes
build is broken it seems.
A quick verification / benchmark (single bench run) on the targets I can test easily (on zen2 architecture), gives a feeling for the order of magnitudes. Good thing is all benches match. If people can test x86-64-avx512 armv7 armv8 ppc-32 ppc-64 that would be appreciated.
target | nodes | nodes NNUE | nps | nps NNUE |
---|---|---|---|---|
general-32 | 4578298 | 3377227 | 1725705 | 134486 |
x86-32-old | 4578298 | 3377227 | 1783520 | 133497 |
x86-32 | 4578298 | 3377227 | 1758178 | 133635 |
general-64 | 4578298 | 3377227 | 2423662 | 308000 |
x86-64 | 4578298 | 3377227 | 2485503 | 436108 |
x86-64-modern | 4578298 | 3377227 | 2611693 | 424648 |
x86-64-sse3 | 4578298 | 3377227 | 2470749 | 1131778 |
x86-64-sse3-popcnt | 4578298 | 3377227 | 2675802 | 1199299 |
x86-64-ssse3 | 4578298 | 3377227 | 2577870 | 1166975 |
x86-64-sse41 | 4578298 | 3377227 | 2445672 | 1243914 |
x86-64-sse42 | 4578298 | 3377227 | 2549163 | 1113126 |
x86-64-avx2 | 4578298 | 3377227 | 2678933 | 1635461 |
x86-64-bmi2 | 4578298 | 3377227 | 1906829 | 1411884 |
Edit: since 'x86-64-modern' on master enables both sse3 and popcnt already, we can easily adjust that one (actually, incorrect, needs ssse3 not sse3, the target x86-64-sse3 enables both, increased modern to require ssse3).
Fix for debug build: PR #2827.
Some more tasks as seen from the CI (patches welcome):
* clang: nnue/evaluate_nnue.cpp:60:45: error: no member named 'aligned_alloc' in namespace 'std'
Compiler bug. std::aligned_alloc is standard C++17.
or header missing?
hmm. std::aligned_alloc
should exist after #include <cstdlib>
which exists in "types.h"
, which is included from "position.h"
, which is included from nnue/evaluate_nnue.cpp
...
OK, needs clang 7.0 (https://godbolt.org/z/rE3joM), CI is still based on 6.0. .travis.yml needs updating gcc needs to be >= 7.4
I have add a commit to upgrade the travis settings on Linux... probably somebody needs to come with a similar fix for the osx of our CI.
Edit: seems like this upgrade didn't work. I'll revert that patch and we'll address the issue later
@vondele
I'm afraid that assemblies for the x86_64 and x86_32 architecture (without sse3 or popcnt) are useless for testing in the framework, since they are much slower than the classic Stockfish assemblies for the same architecture and are much weaker. Testing them can significantly affect the quality of the tests.
almost certainly all machines on fishtest support x86-64-modern (which is right now equivalent to x86-64-sse3-popcnt) on the branch. The majority of the machines support avx2 or bmi2. We'll have to deal with that eventually, but right now it is not too much of a concern.
On compiling for macOS
On macOS 10.15.6 (Catalina) make build ARCH=x86-64-bmi2 COMP=clang
has two issues:
The first one: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/malloc/_malloc.h:50:10: note: 'aligned_alloc' has been marked as being introduced in macOS 10.15 here, but the deployment target is macOS 10.9.0
So we need to change -mmacosx-version-min=10.9
in the Makefile to -mmacosx-version-min=10.15
but according to statcounter only ~52% of Mac users are on 10.15.
If we do change the min version, second issue:
nnue/evaluate_nnue.cpp:60:40: error: no member named 'aligned_alloc' in namespace 'std'; did you mean simply 'aligned_alloc'?
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/malloc/_malloc.h:50:10: note: 'aligned_alloc' declared here
It seems like the C++ library that ships with Xcode dev tools has aligned_alloc in the global namespace. Could maybe get around this with a #if __APPLE__
.
Fix for memset/memcpy warnings and valgrind issue: PR #2830.
I can't see unfreed memory issue. Could you check again after applying these fixes?
error on build
$ make build ARCH=x86-64 COMP=mingw
Config:
debug: 'no'
sanitize: 'no'
optimize: 'yes'
arch: 'x86_64'
bits: '64'
kernel: 'MINGW64_NT-10.0-18363'
os: 'Windows_NT'
prefetch: 'yes'
popcnt: 'no'
sse: 'yes'
sse3: 'no'
ssse3: 'no'
sse41: 'no'
sse42: 'no'
avx2: 'no'
pext: 'no'
avx512: 'no'
Flags:
CXX: g++
CXXFLAGS: -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2
LDFLAGS: -static
Testing config sanity. If this fails, try 'make help' ...
C:/Program Files/Git/mingw64/bin/make.exe ARCH=x86-64 COMP=mingw all
make[1]: Entering directory 'C:/Users/MichaelB7/home/Github/Stockfish/src'
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o benchmark.o benchmark.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o bitbase.o bitbase.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o bitboard.o bitboard.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o endgame.o endgame.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o evaluate.o evaluate.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o main.o main.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o material.o material.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o misc.o misc.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o movegen.o movegen.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o movepick.o movepick.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o pawns.o pawns.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o position.o position.cpp
position.cpp: In member function 'Position& Position::set(const string&, bool, bool, StateInfo*, Thread*)':
position.cpp:199:39: warning: 'void* memset(void*, int, size_t)' clearing an object of non-trivial type 'struct StateInfo'; use assignment or value-initialization instead [-Wclass-memaccess]
199 | std::memset(si, 0, sizeof(StateInfo));
| ^
In file included from position.cpp:31:
position.h:40:8: note: 'struct StateInfo' declared here
40 | struct StateInfo {
| ^~~~~~~~~
position.cpp: In member function 'void Position::do_move(Move, StateInfo&, bool)':
position.cpp:714:51: warning: 'void* memcpy(void*, const void*, size_t)' writing to an object of a non-trivial type 'struct StateInfo' leaves 1272 bytes unchanged [-Wclass-memaccess]
714 | std::memcpy(&newSt, st, offsetof(StateInfo, key));
| ^
In file included from position.cpp:31:
position.h:40:8: note: 'struct StateInfo' declared here
40 | struct StateInfo {
| ^~~~~~~~~
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o psqt.o psqt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o search.o search.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o thread.o thread.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o timeman.o timeman.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o tt.o tt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o uci.o uci.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o ucioption.o ucioption.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o tune.o tune.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o syzygy/tbprobe.o syzygy/tbprobe.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o nnue/evaluate_nnue.o nnue/evaluate_nnue.cpp
nnue/evaluate_nnue.cpp: In function 'void Eval::NNUE::Detail::Initialize(Eval::NNUE::AlignedPtr<T>&)':
nnue/evaluate_nnue.cpp:60:45: error: 'aligned_alloc' is not a member of 'std'; did you mean 'aligned_union'?
60 | pointer.reset(reinterpret_cast<T*>(std::aligned_alloc(alignof(T), sizeof(T))));
| ^~~~~~~~~~~~~
| aligned_union
make[1]: *** [<builtin>: nnue/evaluate_nnue.o] Error 1
make[1]: Leaving directory 'C:/Users/MichaelB7/home/Github/Stockfish/src'
make: *** [Makefile:595: build] Error 2
MichaelB7@VM-894787 MINGW64 ~/home/Github/Stockfish/src (nnue-player-wip)
@MichaelB7 Please provide more details, such as what version of g++
you have and your OS/platform. Also, you could try https://github.com/official-stockfish/Stockfish/pull/2831 and see if that fixes your issue—I think the PR makes the aligned_alloc
issue more compatible in general for platforms other than macOS.
$ stockfish Stockfish 250720 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott compiler
Compiled by g++ (GNUC) 10.1.0 on MinGW64 VERSION macro expands to: 10.1.0
Windows 10 Pro Version 10.0.18363 Build 18363 will try your patch
Update: The patch did not resolve the issue:
$ make build ARCH=x86-64 COMP=mingw
Config:
debug: 'no'
sanitize: 'no'
optimize: 'yes'
arch: 'x86_64'
bits: '64'
kernel: 'MINGW64_NT-10.0-18363'
os: 'Windows_NT'
prefetch: 'yes'
popcnt: 'no'
sse: 'yes'
sse3: 'no'
ssse3: 'no'
sse41: 'no'
sse42: 'no'
avx2: 'no'
pext: 'no'
avx512: 'no'
Flags:
CXX: g++
CXXFLAGS: -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2
LDFLAGS: -static
Testing config sanity. If this fails, try 'make help' ...
C:/Program Files/Git/mingw64/bin/make.exe ARCH=x86-64 COMP=mingw all
make[1]: Entering directory 'C:/Users/MichaelB7/home/Github/Stockfish/src'
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -Wextra -Wshadow -DNDEBUG -O3 -DIS_64BIT -msse -DUSE_SSE2 -c -o nnue/evaluate_nnue.o nnue/evaluate_nnue.cpp
nnue/evaluate_nnue.cpp: In function 'void Eval::NNUE::Detail::Initialize(Eval::NNUE::AlignedPtr<T>&)':
nnue/evaluate_nnue.cpp:60:40: error: there are no arguments to 'aligned_malloc' that depend on a template parameter, so a declaration of 'aligned_malloc' must be available [-fpermissive]
60 | pointer.reset(reinterpret_cast<T*>(aligned_malloc(alignof(T), sizeof(T))));
| ^~~~~~~~~~~~~~
nnue/evaluate_nnue.cpp:60:40: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
nnue/evaluate_nnue.cpp: In instantiation of 'void Eval::NNUE::Detail::Initialize(Eval::NNUE::AlignedPtr<T>&) [with T = Eval::NNUE::FeatureTransformer; Eval::NNUE::AlignedPtr<T> = std::unique_ptr<Eval::NNUE::FeatureTransformer, Eval::NNUE::AlignedDeleter<Eval::NNUE::FeatureTransformer> >]':
nnue/evaluate_nnue.cpp:79:43: required from here
nnue/evaluate_nnue.cpp:60:54: error: 'aligned_malloc' was not declared in this scope; did you mean '_aligned_malloc'?
60 | pointer.reset(reinterpret_cast<T*>(aligned_malloc(alignof(T), sizeof(T))));
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
| _aligned_malloc
nnue/evaluate_nnue.cpp: In instantiation of 'void Eval::NNUE::Detail::Initialize(Eval::NNUE::AlignedPtr<T>&) [with T = Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::InputSlice<512>, 32> >, 32> >, 1>; Eval::NNUE::AlignedPtr<T> = std::unique_ptr<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::InputSlice<512>, 32> >, 32> >, 1>, Eval::NNUE::AlignedDeleter<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::ClippedReLU<Eval::NNUE::Layers::AffineTransform<Eval::NNUE::Layers::InputSlice<512>, 32> >, 32> >, 1> > >]':
nnue/evaluate_nnue.cpp:80:31: required from here
nnue/evaluate_nnue.cpp:60:54: error: 'aligned_malloc' was not declared in this scope; did you mean '_aligned_malloc'?
60 | pointer.reset(reinterpret_cast<T*>(aligned_malloc(alignof(T), sizeof(T))));
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
| _aligned_malloc
make[1]: *** [<builtin>: nnue/evaluate_nnue.o] Error 1
make[1]: Leaving directory 'C:/Users/MichaelB7/home/Github/Stockfish/src'
make: *** [Makefile:595: build] Error 2
But if I make the suggested change from the error message above , it worked.
replacing this on line 60:
pointer.reset(reinterpret_cast<T*>(aligned_alloc(alignof(T), sizeof(T))));
with this:
pointer.reset(reinterpret_cast<T*>(_aligned_malloc(alignof(T), sizeof(T))));
@dorzechowski one more valgrind warning triggers when loading TB files:
uciok
setoption name SyzygyPath value ../tests/syzygy/
==122921== Conditional jump or move depends on uninitialised value(s)
==122921== at 0x13DF43: Position::set(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, StateInfo*, Thread*) (position.cpp:222)
==122921== by 0x13EFDE: Position::set(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Color, StateInfo*) (position.cpp:407)
==122921== by 0x16D412: (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>::TBTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tbprobe.cpp:366)
==122921== by 0x173AB9: void __gnu_cxx::new_allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> >::construct<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>((anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (new_allocator.h:147)
==122921== by 0x171CFC: void std::allocator_traits<std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::construct<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> >&, (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (alloc_traits.h:484)
==122921== by 0x171D97: void std::deque<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::_M_push_back_aux<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (deque.tcc:496)
==122921== by 0x170C2F: (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>& std::deque<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::emplace_back<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (deque.tcc:174)
==122921== by 0x16DAFB: (anonymous namespace)::TBTables::add(std::vector<PieceType, std::allocator<PieceType> > const&) (tbprobe.cpp:489)
==122921== by 0x16F2A6: Tablebases::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tbprobe.cpp:1367)
==122921== by 0x167A8C: UCI::on_tb_path(UCI::Option const&) (ucioption.cpp:44)
==122921== by 0x1699F0: UCI::Option::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (ucioption.cpp:202)
==122921== by 0x163837: (anonymous namespace)::setoption(std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >&) (uci.cpp:100)
==122921==
==122921== Conditional jump or move depends on uninitialised value(s)
==122921== at 0x13DF43: Position::set(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, StateInfo*, Thread*) (position.cpp:222)
==122921== by 0x13EFDE: Position::set(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Color, StateInfo*) (position.cpp:407)
==122921== by 0x16D62F: (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>::TBTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tbprobe.cpp:385)
==122921== by 0x173AB9: void __gnu_cxx::new_allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> >::construct<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>((anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (new_allocator.h:147)
==122921== by 0x171CFC: void std::allocator_traits<std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::construct<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> >&, (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (alloc_traits.h:484)
==122921== by 0x171D97: void std::deque<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::_M_push_back_aux<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (deque.tcc:496)
==122921== by 0x170C2F: (anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>& std::deque<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0>, std::allocator<(anonymous namespace)::TBTable<((anonymous namespace)::TBType)0> > >::emplace_back<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (deque.tcc:174)
==122921== by 0x16DAFB: (anonymous namespace)::TBTables::add(std::vector<PieceType, std::allocator<PieceType> > const&) (tbprobe.cpp:489)
==122921== by 0x16F2A6: Tablebases::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tbprobe.cpp:1367)
==122921== by 0x167A8C: UCI::on_tb_path(UCI::Option const&) (ucioption.cpp:44)
==122921== by 0x1699F0: UCI::Option::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (ucioption.cpp:202)
==122921== by 0x163837: (anonymous namespace)::setoption(std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >&) (uci.cpp:100)
==122921==
info string Found 35 tablebases
Edit: this should be fixed by:
diff --git a/src/position.cpp b/src/position.cpp
index 52ba710ae..8f2c4b029 100644
--- a/src/position.cpp
+++ b/src/position.cpp
@@ -404,7 +404,7 @@ Position& Position::set(const string& code, Color c, StateInfo* si) {
string fenStr = "8/" + sides[0] + char(8 - sides[0].length() + '0') + "/8/8/8/8/"
+ sides[1] + char(8 - sides[1].length() + '0') + "/8 w - - 0 10";
- return set(fenStr, false, use_nnue(), si, nullptr);
+ return set(fenStr, false, false, si, nullptr);
}
This set() method is basically just there to create a materials key, and starts from a freshly created Position() object.
@MichaelB7 Thank you. I can now at least compile. However, I believe the parameters for aligned_alloc and _aligned_malloc should be reversed. Also we should probably match it with _aligned_free(ptr) instead of std::free(ptr).
first green CI badge on Linux/gcc https://travis-ci.org/github/official-stockfish/Stockfish/jobs/711914367
@vondele Another thing to ascertain before we can use NNUE in SF is a license to use it. As nodchip's implementation doesn't have any license attached, I'm not sure we can even use it legally now. We assume so but it's not enough I think. This should be solved together with @nodchip and those files should have either GPLv3 or compatible license attached.
@dorzechowski would be great @nodchip confirms this, but as the code (binaries and sources) are being distributed already, the license is GPL, this is also what the repository Copying file in the repository mentions.
Meanwhile also green on Linux/clang : https://travis-ci.org/github/official-stockfish/Stockfish/jobs/711961757
@dorzechowski would be great @nodchip confirms this, but as the code (binaries and sources) are being distributed already, the license is GPL, this is also what the repository Copying file in the repository mentions.
That's great. All this licensing can be very confusing.
I can confirm the x86-64-avx512 gives the same 3377227 bench as above.
@vondele I believe task "remove non-player related code" is done already. I see nothing that could be removed without breaking it. Of course possible optimizations is another thing.
@dorzechowski OK, I'll tick that box, assuming you had a careful look.
I have also verified bench on linux-rhel7-ppc64le using gcc version 9.1.0 (GCC)
There are warnings however:
nnue/evaluate_nnue.cpp: In function ‘Value Eval::NNUE::ComputeScore(const Position&, bool)’:
nnue/evaluate_nnue.cpp:124:61: warning: requested alignment 64 is larger than 16 [-Wattributes]
124 | transformed_features[FeatureTransformer::kBufferSize];
| ^
nnue/evaluate_nnue.cpp:126:61: warning: requested alignment 64 is larger than 16 [-Wattributes]
126 | alignas(kCacheLineSize) char buffer[Network::kBufferSize];
| ^
seemingly that's not a bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68267) and gcc provides a __BIGGEST_ALIGNMENT__
macro which is arch specific (https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html). It is however different from 64 for most arch I tried (except knl).
Maybe there is an easy solution?
@vondele Another thing to ascertain before we can use NNUE in SF is a license to use it. As nodchip's implementation doesn't have any license attached, I'm not sure we can even use it legally now. We assume so but it's not enough I think. This should be solved together with @nodchip and those files should have either GPLv3 or compatible license attached.
AFAIK YaneuraOu is the original author and wrote about it with his full support:
http://yaneuraou.yaneu.com/2020/06/21/stockfish-nnue-subjective-review/
https://twitter.com/nodchip/status/1286947373374087168/retweets/with_comments
YaneuraOu are distributed under GPLv3. https://github.com/yaneurao/YaneuraOu I also intended that the ported code and code that I wrote are distributed under GPLv3. If there are additional works to express that those code are distributed under GPLv3, please let me know.
Do we know why the bench of the @nodchip branch with the default net is 3909820 but our nnue-player-wip is 3377227? The reason I ask is because I played a local STC match between SF and SFNNUE and after 100 games SF won them all. Obviously something seems broken. @dorzechowski Did you do any play testing w/ your branch and what were the results?
@nodchip thanks for confirming the license. There is nothing particular you need to do, I might add a copyright header to the added files, that makes things clear.
@mstembera I think I did play with the branch, and had good results. Obviously that's something to clarify. I'm not sure both branches have the same version of master merged.
@mstembera quick check of the current status of master vs branch looks good for me:
Score of master vs nnue: 17 - 49 - 84 [0.393] 150
Elo difference: -75.3 +/- 36.7, LOS: 0.0 %, DrawRatio: 56.0 %
@vondele Thanks. I tried testing using the chessbase gui and all is fine there. Seems I only have the issue when using cutechess-cli. If I figure out what I'm doing wrong I'll report.
I have used:
./cutechess-cli -repeat -rounds 10000 -games 2 -tournament gauntlet -resign movecount=3 score=400 -draw movenumber=34 movecount=8 score=20 -concurrency 15 -openings file=noob_3moves.epd format=epd order=random plies=16 -engine name=master cmd=stockfish.master -engine name=nnue cmd=stockfish.nnue option.EvalFile=/home/vondele/chess/match/nn-c157e0a5755b.bin option."Use NNUE"=true -ratinginterval 1 -each tc=10.0+0.1 proto=uci option.Threads=1 -pgnout nnue.pgn
If I have to guess, most likely missing the full path to EvalFile ?
Meanwhile results with a few more games:
Score of master vs nnue: 498 - 1019 - 1879 [0.423] 3396
Elo difference: -53.7 +/- 7.8, LOS: 0.0 %, DrawRatio: 55.3 %
Those are encouraging results. I wasn't using the EvalFile path because I have the net in the same directory as the .exe Adding it anyway seems to make no difference. When I test nnue vs nnue things seem normal except the draw ration is only about 10%. Most wins are by adjudication as normal.
@mstembera I think cutechess has the current working directory independent of where the engine binary is, and as a result the binary doesn't find the EvalFile, unless given explicitly with the full path. Adding -debug
should show that in the output. After printing the error message the code continues, with a zero-ed net.
@vondele Yes that was it. Thanks! I would have never guessed cutechess was overriding the current directory. I wonder if we should just fail when "Use NNUE"=true but the network fails to load?
you are touching on a difficult point, and that is how to deal with the net in general.
For example, do we expect people who build the code to download the default net to the src directory ? This would make it relatively easy to enable profile-build. However, to figure out the name of the default net, one should first build the binary, execute uci and look at EvalFile. So, require a build, afterward a profile-build.
Similar question for the bench command. One could test both NNUE and handcrafted eval in a single bench, if we assume the developer has the default net in the src directory. Similar question of course if 'Use NNUE' would default to true.
Probably that's something we could expect for developers, but I'm sure we will all be bitten by this requirement at some point. Other opinions?
Yes this isn't straight forward. I think it's important than when inevitably something goes wrong it doesn't happen silently. This makes it much harder for a person to track down. The current net is about 21MB and compresses to about 8MB. Is it out of the question to simply add it to the source directory along with the source files?
@vondele Here is a commit by @nodchip to solve the issue of allowing the net path to be relative to the binary. https://github.com/nodchip/Stockfish/commit/fbdb373b6482db2b462b30d7399c3c7fed1d4f26
will profile build without net in directory give same optimizer compiler flags as with net in directory? does it matter if use nnue is false default? agree with @mstembera - good to include default net in source directory and have net in same directory as exe by default
@mstembera the problem with adding to the source directory is that the git repo will become very large to clone 20Mb with each new net. We might also run into github bandwidth limits if there are too many downloads. However, I don't think we should add the code you referenced for 'finding' the binary. This is OS/GUI specific, and right now, the problem is not very different from specifying the syzygy directory. Also in that case a relative path doesn't work.
@stockchess there is some impact of having NNUE false by default for the profile-build ... if the code is not executed the profiling will not gather statistics for it, and thus optimize less. The difference is likely not spectacular, but will be visible.
Meanwhile two tests with using nnue=false show some overhead, wrt to master.
https://tests.stockfishchess.org/tests/view/5f1fddb7c09435d870cb9f15 https://tests.stockfishchess.org/tests/view/5f1ff3bdc09435d870cb9f25
it is just below the self-put target of 5 Elo, but would be nice to improve upon this.
@vondele Can't the old net be purged from the repo each time we add a new one so that it never has more than one and doesn't grow?
technically, yes, but that's kind of opposite of what a version control system like git wants, and certainly not good style https://docs.github.com/en/github/managing-large-files/removing-files-from-a-repositorys-history
@mstembera Yes, I played a lot of test matches in cutechess-cli, LittleBlitzer and Arena and the results were as good as reported by other people. I don't have any Chessbase GUI to test. I also carefully checked that bench didn't change after I removed/refactored some code. I also updated it to latest master. It's still possible that I've missed something but I'm pretty confident I haven't.
The NNUE branch maintained by @nodchip has demonstrated strong results and offers great potential, and we will proceed to merge it into master. This will assure that Stockfish remains a reference engine based on the top computer chess technology for the foreseeable future. This merge will introduce machine learning based coding to the engine, thus enlarging the community of developers, bringing in new skills. We are eager to keep everybody on board, including all developers and users of diverse hardware, aiming to be an inclusive community. We will keep and evolve the handcrafted evaluation, not only for the beauty and insight it offers, but for additional reasons, including its value on diverse hardware platforms, and for the potential of hybridization with NNUE, and use in search. The initial goal for the merge is the NNUE playing capability only, this focus makes sense given the complexity of the project, and the aim of Stockfish as a production quality chess engine. We will try to make changes such that the @nodchip repo and ours can be kept in sync as easily as possible. Written by knowledgeable people, NNUE is already of good quality, we will make small modifications needed to make it pass our standard CI, and integrate nicely in the existing engine. Rigorous testing will continue to be a cornerstone of our development, and accordingly integration in fishtest needs to be properly implemented. Afterwards, we will guarantee continuous progression in playing strength testing core engine and networks with our usual SPRT process. Some initial steps of the merge will be outlined below, the precise steps needed will become clearer as we proceed, I look forward to working with the community to make this happen!