Closed AlexB123 closed 4 years ago
Hi again! As I mentioned above, I built the Android engine by changing the line to "std_aligned_alloc". The engine works, but only without the network; if I tick the network option, the engine crashes instantly. Any ideas how to fix this?
By the way, why must the network be named "nn-c157e0a5755b.nnue"? Wouldn't it be much easier to call it "nn.nnue"?
You can create a PR here https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip adding an ARM section to std_aligned_alloc() and std_aligned_free() in misc.cpp. The net is named with the first 12 characters of the SHA256 hash. This is so that when the default nets change we can uniquely tell them apart. You can use any name you like for your custom net and specify it as a UCI option.
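To make the naming rule concrete, here is a minimal sketch, assuming the SHA256 digest of the net file is already available as a lowercase hex string (computing the digest itself is left out). The helper names are hypothetical and are not part of the Stockfish source.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Stockfish's code): build the conventional file
// name from the hex SHA256 digest of a net file.
// Returns an empty string if the digest is too short to take a prefix from.
std::string net_name_from_digest(const std::string& sha256_hex) {
    constexpr std::size_t PrefixLen = 12;  // "first 12 characters", per the thread
    if (sha256_hex.size() < PrefixLen)
        return "";
    return "nn-" + sha256_hex.substr(0, PrefixLen) + ".nnue";
}

// Check whether a file name is the one its digest would produce.
bool name_matches_digest(const std::string& file_name,
                         const std::string& sha256_hex) {
    const std::string expected = net_name_from_digest(sha256_hex);
    return !expected.empty() && file_name == expected;
}
```

With a digest that starts with c157e0a5755b, the first helper yields nn-c157e0a5755b.nnue, which is why two different default nets can never end up with the same name by accident.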
I've tried this change but it didn't work..
Hello, thank you for the feedback! I don't know how to create a PR. I'm not a programmer, so can you give an example of how to add "std_aligned_alloc() and std_aligned_free()" in misc.cpp (I mean in what order)? I've tried it like this and it didn't work, so apparently I did something wrong. Thank you!
Sorry, I forgot to mention: I built SF NNUE from nodchip's source code https://github.com/nodchip/Stockfish The engine works fine and reads nn.bin, although it is much slower than normal SF, and without nn.bin the engine plays horribly; look at the analysis.
Yes - his fork was designed for NNUE only - will not play well without bin.
Thanks, I didn't know that. I thought that without nn.bin it was supposed to play like normal SF; anyway, now I know. :-)
Yes - he made two different executables; the Stockfish team is making it a UCI option, so one exe can play both.
@AlexB123 I created the PR for you here https://github.com/official-stockfish/Stockfish/pull/2872 Thanks.
Edit: Actually I may have done it wrong. Is this an ARM thing or an Android thing? We can use either defined(IS_ARM) or defined(_ANDROID_)
Hello guys, just wanted to let you know that the Android version is still crashing. :(
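On the ARM-versus-Android question, a minimal sketch of how the two can be told apart at compile time. It assumes a GCC- or Clang-family compiler: `__ANDROID__` is predefined by the NDK's Clang, and `__aarch64__` / `__arm__` are predefined for 64-bit and 32-bit ARM targets. The function names are hypothetical, and IS_ARM from the comment above is not used because it is not a compiler-provided macro.

```cpp
#include <string>

// True only when the compiler targets Android.
constexpr bool is_android_build() {
#if defined(__ANDROID__)
    return true;
#else
    return false;
#endif
}

// Reports the ARM flavour of the target, independent of the operating system.
std::string arm_kind() {
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm32";
#else
    return "not-arm";
#endif
}
```

The two checks are independent: a Raspberry Pi build is ARM but not Android, while an x86 Android emulator build is Android but not ARM, so which one guards the allocator depends on whether the missing function is a libc issue or a CPU issue.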
Flags that I use to build the engine.
set "compiler_options=-m64 -march=armv8-a -DIS_64BIT -fPIE -Wl,-pie -lm -DUSE_POPCNT -DNO_PREFETCH -DUSE_NEON -O3 -flto -static-libstdc++ -std=c++17 -fno-strict-aliasing -fno-strict-overflow -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,-s"
Btw, from nodchip's old source code https://github.com/nodchip/Stockfish , the engine builds and works with the following flags ->
set "compiler_options=-m64 -march=armv8-a -DIS_64BIT -fPIE -Wl,-pie -lm -DUSE_POPCNT -DEVAL_NNUE -DENABLE_TEST_CMD -fopenmp -O3 -flto -static-libstdc++ -std=c++17 -fno-strict-aliasing -fno-strict-overflow -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,-s"
Maybe this can help you somehow to solve the issue.
Thank you!
So, it compiles but crashes at runtime?
Edit: at which point does the crash happen, i.e. do you have any output, and which UCI commands do you send?
Hi vondele! The engine compiles with a small correction in misc.cpp line 329, changing it to "return std_aligned_alloc(alignment, size);", or by using these changes https://github.com/official-stockfish/Stockfish/pull/2872/commits/af6473aa5bc5f7adbc912658aa8c3671ce9ad967 The engine works fine in DroidFish as normal Stockfish. But when I tick the "Use NNUE" option in the engine's settings, it crashes instantly with the message "engine terminated".
I have a feeling that some flag is missing in the Makefile, one that is responsible for enabling NNUE in the engine. Since I don't know which flag that is, I don't know what to write in the batch file, so the compiler generates a normal engine that cannot use NNUE. Or the NDK's Clang is unable to cooperate with NNUE. I don't know how else to explain these crashes.
@AlexB123 that change you make (i.e. calling std_aligned_alloc) is not OK. It will compile but crash. Can you try instead of your change the change proposed https://github.com/official-stockfish/Stockfish/pull/2927 i.e. https://github.com/official-stockfish/Stockfish/pull/2927/files
Commands execution in SManager for Android. bench
uci
setoption name Use NNUE value true
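For anyone scripting these commands rather than typing them, a minimal sketch of building the command strings. It follows the standard UCI form `setoption name <id> value <x>`; the helper names are hypothetical, and the only option name used is "Use NNUE" from this thread.

```cpp
#include <string>

// Format a UCI setoption command for an option with a string value.
std::string uci_setoption(const std::string& name, const std::string& value) {
    return "setoption name " + name + " value " + value;
}

// Format a UCI setoption command for a check (boolean) option.
// A separate name is used instead of an overload, because a string literal
// passed as the value would otherwise convert to bool and pick the wrong one.
std::string uci_set_check(const std::string& name, bool enabled) {
    return uci_setoption(name, enabled ? "true" : "false");
}
```

Calling uci_set_check("Use NNUE", true) produces exactly the line sent above.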
Ok, i'll try it later, i have to go now. :)
Hello! Having tried new flags, compiler gives a new error.
Can you try to #include <stdlib.h> in the file?
Not sure if i did it correctly -> misc.cpp, +line 52 "#include
So can you instead try to use this:
void* std_aligned_alloc(size_t alignment, size_t size) {
    // alignment must be >= sizeof(void*)
    if (alignment < sizeof(void*))
    {
        alignment = sizeof(void*);
    }
    void* pointer;
    if (posix_memalign(&pointer, alignment, size) == 0)
        return pointer;
    return nullptr;
}
and leave #include <stdlib.h> in the file near line 56.
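For completeness, a self-contained sketch of the same allocator together with its free counterpart and an alignment check, assuming a POSIX libc (which is what Android's bionic provides). Memory obtained from posix_memalign is released with plain free(). The is_aligned helper is hypothetical and only there to verify the result; this is not necessarily the code that ends up in misc.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Allocate size bytes aligned to alignment (a power of two).
// Returns nullptr on failure instead of throwing.
void* std_aligned_alloc(size_t alignment, size_t size) {
    // posix_memalign requires alignment >= sizeof(void*)
    if (alignment < sizeof(void*))
        alignment = sizeof(void*);
    void* pointer = nullptr;
    if (posix_memalign(&pointer, alignment, size) == 0)
        return pointer;
    return nullptr;
}

// Release memory obtained from std_aligned_alloc.
void std_aligned_free(void* ptr) {
    free(ptr);
}

// Hypothetical helper: check that a pointer honours the requested alignment.
bool is_aligned(const void* ptr, size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

Checking the returned pointer for nullptr before use is a cheap way to rule out an allocation failure as the cause of a crash.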
With these changes the engine compiles without errors or warnings, but again it crashes when I tick the "Use NNUE" box. It works as a normal engine only.
That code looks right, so we probably have a different reason for the crash (unless the code returns a nullptr). I assume you have the right 'ARCH=...' option for the make command?
To move on we need to be able to understand where it crashes. Usually that would mean compiling (after make clean) with the debug=yes optimize=no flags to make, and afterwards running it under gdb, like:
gdb ./stockfish
run
setoption name Use NNUE value true
bench
[crash]
bt
I use the flag -march=armv8-a in my batch file, for armv8 64-bit engines. Since the engine works, but only as normal SF, the flag / ARCH is correct. I'll try to build the engine without -flto and -DUSE_POPCNT, and let you know later if something changes. Regards. Alex.
Well team, I give up. I used the latest source code, and the first issue with compiling still remains.
With all the changes to misc.cpp mentioned above, the engine compiles but is not 100% functional. It can execute commands like "uci" and "bench", but it fails to execute "setoption name Use NNUE value true"; simply put, it works only without the "Use NNUE" option. I've tried several flags, "-DNDEBUG", "-DUSE_NEON", "-O3", and also building without all these flags; nothing works. There must be a flag (or flags) in the Makefile or misc.cpp that is responsible for enabling the NNUE functions in the engine, but I don't know which flag that is. Maybe Peter Österlund can help? http://talkchess.com/forum3/viewtopic.php?p=853010#p853010
Thank you, vondele. Your patch in this thread (as it appears in AlexB123's screenshot) allowed the compile to finish. I think the binary is actually working too.
I can build for both aarch64 and armv7, but I can only test armv7 binaries right now.
@AlexB123 DroidFish doesn't seem to like the Use NNUE checkbox option. It crashes and/or wouldn't start. My binary appears to be working alright in Chess for Android, and also in a terminal emulator app. Maybe you might want to try your build in those apps instead, although it sounds like yours was crashing in the terminal emulator too?
You may have already noticed this: the current official branch wants the .nnue file to be in the same folder as the engine, not in a sub-folder any more. Chess for Android requires the .nnue file to be installed the same way as an engine, so I assume they're being put in the same dir.
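A minimal sketch of what "same folder as the engine" means for the lookup path, assuming POSIX-style `/` separators as on Android. The helper and the example paths are hypothetical; this is not the engine's actual loading code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the path of the engine binary, return where a
// net file placed next to it would be found.
std::string net_path_beside_engine(const std::string& engine_path,
                                   const std::string& net_file) {
    const std::size_t slash = engine_path.find_last_of('/');
    if (slash == std::string::npos)
        return net_file;  // engine given without a directory: use current dir
    return engine_path.substr(0, slash + 1) + net_file;
}
```

For an engine at /data/engines/stockfish this gives /data/engines/nn-112bb1c8cdb5.nnue, with no eval sub-folder in between.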
I'll upload my aarch64 build, in case you want to test it. Let me know how it works. The only change is vondele's patch applied to misc.cpp, and I used my usual build flags (somewhat different from what you've posted above).
It is based on this commit iirc : https://github.com/official-stockfish/Stockfish/commit/ad2ad4c65706c18a5383506d361f1f23fc6a26ab
In the terminal emulator, I first ran a bench and the speed was on par with what I usually get for regular non-NNUE Stockfish.
Then, without bringing the .nnue file into the terminal emulator yet, I did a setoption name Use NNUE value true followed by another bench... This time, I get a warning text "Use of NNUE evaluation, but the file ____.nnue was not loaded successfully." and so on. The benchmark didn't run.
After I have the correct .nnue file, I set Use NNUE again and this time the benchmark did run, and at a significantly lower speed than before. So I assumed my armv7 build was actually using NNUE.
@notruck so for you, on android, the current master (i.e. calling aligned_alloc) does not build?
However, if you use the code based on posix_memalign, it does work?
What are your usual build flags, i.e. is there anything we can do to making building on android easier?
for armv7, I used
CXXFLAGS += --target=armv7a-linux-androideabi16 -fno-addrsig -stdlib=libc++ -O3 -Ofast -mfpu=neon-vfpv4 -mthumb -march=armv7-a -mtune=cortex-a53 -mfloat-abi=softfp -Wall -Wcast-qual -fno-exceptions -std=c++17 $(EXTRACXXFLAGS)
DEPENDFLAGS += -std=c++17
LDFLAGS += -static-libstdc++ -latomic $(EXTRALDFLAGS) # -fuse-ld=lld
The current master is still failing to build, both for clang 9.0.8 included with Google's NDK r21, and clang 10.0.0 provided by Termux. It leads to the same exact problem AlexB described in his first post.
What worked for me was your patch above, exactly as it appears in AlexB123's screenshot, along with the #include <stdlib.h>
@notruck I'll try to make a PR that includes that code snippet, and would appreciate it if you could test it once it is there.
We need to keep the armv7 flags for Android and RPI separate...
These are RPI flags that will work with most RPIs:
"-mfloat-abi=hard -mfpu=neon-fp-armv8 -mneon-for-64bits -mtune=cortex-a53" Otherwise the Pi uses the normal GCC 10 flags. 32-bit is still standard for the RPI, but users can now add a 64-bit kernel and use 64-bit exe's on a 32-bit RPI OS. Best to leave 32-bit as the default.
@vondele Thanks! So the cross-compilation with NDK went smoothly for both armv7 and armv8-a. I haven't yet tested the binaries themselves however.
Currently I don't have the hardware to test armv8-a. Let me attach them here, in case someone wants to help with testing that:
(edit: Oops, I forgot to run make clean for the armv8a build, let me correct that)
Fixed: aligned_alloc_changes.zip
Also, the armv8 build explicitly uses -fPIE and -pie flags now.
@notruck thanks for compiling, please let me know if they test OK.
This PR #2973 would also need testing for OSX and old Linux. @TonHaver @ddugovic does this PR still work on your systems?
Works on my old macbook using MacOS 10.14 Don't have anything newer
Nodes searched : 4094850 (without NNUE)
Nodes searched : 3314442 (with NNUE)
Is this discrepancy something to be expected? I used the default nn-112bb1c8cdb5.nnue net
yes that's correct, completely different eval function. (Bench matches x86)
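The before/after comparison can be automated. Below is a minimal sketch that pulls the node count out of a bench summary line of the form quoted above ("Nodes searched : N") and checks that the two runs differ. The function names are hypothetical, and the exact spacing of real bench output is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Parse a line such as "Nodes searched : 4094850".
// Returns std::nullopt if the line does not have that shape.
std::optional<std::uint64_t> parse_nodes_searched(const std::string& line) {
    const std::string key = "Nodes searched";
    if (line.compare(0, key.size(), key) != 0)
        return std::nullopt;
    const std::size_t colon = line.find(':', key.size());
    if (colon == std::string::npos)
        return std::nullopt;
    std::size_t i = line.find_first_not_of(" \t", colon + 1);
    if (i == std::string::npos || line[i] < '0' || line[i] > '9')
        return std::nullopt;
    std::uint64_t value = 0;
    for (; i < line.size() && line[i] >= '0' && line[i] <= '9'; ++i)
        value = value * 10 + static_cast<std::uint64_t>(line[i] - '0');
    return value;
}

// True if both lines parse and report different node counts, which is what
// one expects when the second run really used a different evaluation.
bool bench_signatures_differ(const std::string& classical_line,
                             const std::string& nnue_line) {
    const auto a = parse_nodes_searched(classical_line);
    const auto b = parse_nodes_searched(nnue_line);
    return a && b && *a != *b;
}
```

Fed the two figures reported above (4094850 and 3314442), the check returns true; identical counts in both runs would suggest the net was not actually in use.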
@vondele and @notruck Hello guys! So let me get it right. :) I have the latest source code, "Cleanup and optimize SSE/AVX code". To build the fully functional engine, I have to make these changes (picture below) plus the changes mentioned here https://github.com/official-stockfish/Stockfish/pull/2973/commits/74bb29abd1967a9e47dd8913e58d9fbd6efc308d , right?
P.S. @notruck are you using this Termux (with Clang 10.0.0), and which flags are you using for armv8? https://play.google.com/store/apps/details?id=com.termux
@AlexB123 probably best to test the code I have as a pull request https://github.com/official-stockfish/Stockfish/pull/2973 I'll merge this in master on a next round.
Sorry, bad news. The same issue (in different lines) appears with the "original" source code. With the changes pointed to here https://github.com/official-stockfish/Stockfish/pull/2973/commits/74bb29abd1967a9e47dd8913e58d9fbd6efc308d , the engine compiles, but it is still unable to use NNUE. Without NNUE it works fine.
@notruck, I've tried your armv8 build, aligned_alloc_changes.zip; it also crashes, the same as mine. :( I have installed Google NDK's toolchains from r21, r21b and r21d; none of them can compile the engine from the source code.
That is a bit strange. In CfA the engine does not crash when using NNUE in analysis mode, and it shows the message "classical evaluation enabled". Does that mean it's using NNUE?
No, classical evaluation is not NNUE.
OK, some good news. The engine built with the corrections mentioned in my post above is working (and so is notruck's aligned_alloc_changes.zip). The trick is that I first need to write the path of the network in the engine's settings, and only afterwards tick the "Use NNUE" option. This way the engine works, but for some reason it does not actually use the network. E.g. it can't solve these positions at all:
rn1qrnk1/p4pp1/1p1pp3/6P1/2Pp1PN1/2PQ4/P5P1/2KR3R w - - 0 1
4q1kr/p6p/1prQPppB/4n3/4P3/2P5/PP2B2P/R5K1 w - - 0 1
while Peter's engine from talkchess (mentioned above) finds the solutions in seconds. So something is still missing in the code. The NNUE functions are not applied on armv engines.
@AlexB123 Peter's engine is from late July, and based on an earlier version, before this commit happened about 2 weeks ago.
Is his engine working OK with the current .nnue nets? Could you please confirm where you are putting the .nnue files? They no longer go inside the eval folder. The engine now expects them to be in the same directory. If the issue isn't with the path and/or filenames, we can try following Peter's methods on the current code.
Looking at his post on TalkChess, it seems Peter also uses the NDK (r20b). He made only minimal changes to the official Makefile. He disables -lpthread, as is supposed to be done for Android. He uses the -static LD flag. He doesn't include most other flags we use.
He made sure to include neon = yes, or to set the slightly different -DUSE_NEON flag. They both should do the same thing (as seen here).
Next, I'll try to build the latest version by following his post. I'll remove all other/extra build flags. I'll use the latest Stockfish-master, and compile it with NDK r21d.
P.S. I normally use the NDK on Linux. I had the r21 (earliest r21 without any letters) before. When it failed, I downloaded r21d. They both have Clang 9.0.8, so I tried Termux next (same app you found/linked above). Termux provides a Clang 10.0.0 but that didn't work either on (untouched) Stockfish-master. I didn't try the patches with Termux, I returned to NDK for those. Termux is a good terminal emulator, but I didn't get far enough with getting it to compile successfully.
Following Peter's changes to the Makefile wouldn't build the current-master. This is probably to be expected.
His Makefile builds this without any problems: https://github.com/vondele/Stockfish/tree/74bb29abd1967a9e47dd8913e58d9fbd6efc308d
The -static binary is larger than my previous one, and may or may not work better at loading the .nnue. It expects the https://tests.stockfishchess.org/api/nn/nn-112bb1c8cdb5.nnue as the default net.
OK, so we'll go in steps. First thing, I'll make the commit of PR https://github.com/official-stockfish/Stockfish/pull/2973 so the source builds on Android without modifications of the src. Later, we should revisit the Makefile, and it is great if we have people able to test it. So I'll leave this issue open after the commit.
@AlexB123 Which network file do you use with Peter's engine? Using the current default nn-112bb1c8cdb5.nnue my devices are not finding the solutions. Not just my Android phone, but my x86-64 laptop also seems to be missing the solutions. I tried abrok binaries too, without any luck.
@vondele Sorry to bother you with this, could you please test those positions above and comment a little on them? Are they supposed to be reliable indicators whether the NNUE is loaded correctly?
I'd think running bench twice (before and after setoption name Use NNUE value true) should sufficiently indicate if the NNUE network is in use, but AlexB's apparent success on armv8 with Peter's engine is making me wonder what I might be missing.
@notruck I use this net; I don't remember the release date, but it is one of Sergio's networks from earlier releases (below). Peter's engine solves those two positions in seconds. To use the network with Peter's engine, you need to create a folder named "eval" and put the network inside. The usual path to Android's memory is /storage/emulated/0, so my path for the network is
/storage/emulated/0/eval/nn.bin
Write the same path in the engine's settings, done.
eval.zip
@vondele and the rest of the team :) , congrats!! Using the current master, "Tweak castling extension", without any changes, the engine compiles without errors and it can use NNUE!! :D
However, it is not able to solve the two positions mentioned above using the same network from my post above. I guess it's because of the patch mentioned by @Joachim26.
Try this arm8 build (NDK r21); don't forget to write the correct path of the network in the engine's settings. SF-NNUE-r21.zip
Hi is this something new or it is already included in the current master? https://github.com/lucabrivio/Stockfish/commit/3e9562123f62d406500c69bc5193f238ece350b9
@notruck can I ask you something: for armv7 engines, did you use NDK r17? As far as I know, this is the latest NDK that supports the armv7 architecture, and it does not cooperate with -DUSE_NEON.
Also, which flags are you using for "static" builds? I made a static armv8 build; it is bigger in size and it's not working, so apparently I messed up the flags.
And a last question: do you use a batch file for compiling the engines? I mean, do you first generate the standalone toolchain from the NDK, and then compile the engines using a batch file?
P.s. sorry for off-topic.
@AlexB123 I use NDK r21 on Linux for both armv7 and armv8 builds. Their latest revision r21d should also work. I still haven't figured out how to cross-compile PGO using the NDK.
I target Android API Level 16 (JellyBean 4.1.x) for armv7, and Android API Level 21 (Lollipop 5.0) for armv8, for maximum compatibility. API 16 because it's the earliest target still supported by r21, and API 21 because it's when 64-bit Android was first introduced.
My Makefile is a total mess, but the net result is to pass these CXXFLAGS for armv8 at build time:
aarch64-linux-android21-clang++ --target=aarch64-linux-androideabi21 -stdlib=libc++ -O3 -Ofast -fPIE -march=armv8-a -fno-addrsig -stdlib=libc++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -DNDEBUG -O3 -DIS_64BIT -DUSE_POPCNT -DUSE_NEON -flto -c -o benchmark.o benchmark.cpp
(same flags for each .cpp file)
The LDFLAGS are:
aarch64-linux-android21-clang++ -o stockfish benchmark.o bitbase.o bitboard.o endgame.o evaluate.o main.o material.o misc.o movegen.o movepick.o pawns.o position.o psqt.o search.o thread.o timeman.o tt.o uci.o ucioption.o tune.o tbprobe.o evaluate_nnue.o half_kp.o -mfpu=neon-vfpv4 -mthumb -mfloat-abi=softfp -static-libstdc++ -latomic -fPIE -pie --target=aarch64-linux-androideabi21 -stdlib=libc++ -O3 -Ofast -fPIE -march=armv8-a -fno-addrsig -stdlib=libc++ -Wall -Wcast-qual -fno-exceptions -std=c++17 -DNDEBUG -O3 -DIS_64BIT -DUSE_POPCNT -DUSE_NEON -flto
Some of them may be redundant/extraneous but I hope they are sound at least. If these flags work out for you, please let me know.
Hello SF team! I have an issue with Android compilations. I'm using the NDK's Clang 9.0.8 (the latest available) on 64-bit Windows. During compilation I'm getting this error.. After I changed line 331 (in misc.cpp) to "std_aligned_alloc", I managed to build the Android engine. Regarding the nets: where should I put the nn.bin file (e.g. in a separate folder, "eval/nn.bin")? And what is the correct name of the net, is it nn.bin or nn.nnue? Thank you!!