Open acristescu opened 4 years ago
So he modified the 20b network to run faster on 64-bit Android? Is there any loss in quality of play? Which net did he pick to optimize (there are quite a few 20b ones)?
@acristescu I think there's no loss in quality of play. It's probably the last 20b extended-training net; I can check it with a hex editor next weekend. With 1 thread on my device, it does 12 playouts in the time it used to take for 2! It is twice as fast as Leela Zero.
I'm curious: when you try the new binary with that net on the command line or in your app, do you see the same speed increase? If so, you might be able to reduce visits by a factor of 4 and get the same strength; the downside would be a bigger apk.
The new 20b net is much faster at visit parity than the old 15b. I've only tested with 1 thread, though.
It can even play instant moves on ancient devices. My $220 e-ink e-reader plays instantly with the new net. I think it is probably the same for Orange Pi users: https://www.amazon.com/dp/B0824PZ45Y?ref=myi_title_dp
network remains the same, its use is optimized
Not quite true. It looks like he converted the model to Google's TensorFlow Lite format and somehow made KataGo work with TensorFlow. This is quite impressive, as the new format should be both faster and (importantly for mobile) quite compact as well. I'm not sure if @lightvector knows about this, but I think this would be something really cool to include back in the main project.
However, I could not get it to run, probably because the command-line parameters are different. @aki65 could you kindly share what the new command-line parameters are (in the spirit of open source)?
Unclear. If you unpack the apk and then unpack "private.mp3" like a zip, part of the contents completely coincides with the official network g170e-b20c256x2-s5303129600-d1228401921.bin.gz. (7ED1600h arm64-v8a-rel-0.21)
The old network is still there, but now there's a new file called 20b.tflite. If you try to just run the katago binary included in 0.21 you also get an error saying that it's missing libtensorflowlite.so. This did not happen with the old 0.16. From this I have deduced that he somehow made katago work with TensorFlow.
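The unpacking described in the comments above (open the apk as a zip, then open "private.mp3" inside it as another zip) could be scripted roughly as follows. This is only a sketch: the function name `list_nested_archive` is made up for illustration, and because the thread does not say where "private.mp3" sits inside the apk, the code simply looks for any entry whose name ends with that string.

```python
import io
import zipfile


def list_nested_archive(apk_bytes: bytes, inner_suffix: str = "private.mp3") -> dict:
    """Return {entry name: uncompressed size} for the zip hidden inside the apk.

    Assumption: the inner archive is the first apk entry whose name ends with
    `inner_suffix`; its exact path inside the apk is not known from the thread.
    """
    # An apk is an ordinary zip file, so zipfile can read it directly.
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        inner_name = next(n for n in apk.namelist() if n.endswith(inner_suffix))
        inner_bytes = apk.read(inner_name)

    # The disguised file is itself a zip; list what it contains.
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return {info.filename: info.file_size for info in inner.infolist()}
```

Calling it on the bytes of a downloaded apk would show at a glance whether a .tflite file sits next to the older .bin.gz network, and how large each one is.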
This is very interesting, but it doesn't work for me v64. So is there a new network or not? Or is it not clear?
only @aki65 can answer these questions for certain.
it doesn't work for me v64
Do you mean LazyBaduk does not work for you or that you cannot run the new katago binary? I can't get it to work either...
Only the 32-bit version works on my devices. For some reason the same is true on the Nox emulator (probably there is no 64-bit support, I will find out).
Do you mean "BadukAI"? And which network? Can this network be pulled out of the apk? How do I find it?
The unpacked "private" files in v17 and v18 are 198MB and 225MB, which leaves little space for another network.
Imagine what happens after distributed training, when the 40b policy reaches 9 dan and he optimizes that network. The kyu rank mode will be awesome. With the opening book turned on, the bots play well all the way down to kyu rank 10, and they go up to kyu rank -8. Cryptpark has tested it against crazystone zero; I think it can already beat the 7d setting?
All of that is great, too bad it is not open source so that other open source projects could use it... :|
What is the meaning of these files in the apk?
10b.bin.gz
20b.bin.gz
15b.tflite
20b.tflite
40b.tflite
The 15b and 40b are for Leela Zero; the 20b is for the optimized KataGo, which is now faster than Leela Zero when numsearchthreads is set to 1.
Answer from @aki65: aki65/aki65.github.io#8 (comment)
to summarize for this thread, aki65 said there should be no change in invoking katago to use the tensorflow optimized net. Can someone run the optimized net in android command line and confirm? This net is really great for calibrated ai because it is so much faster. One can run the kyu rank bots on very old mobile processors now with instant moves; together with his opening book for the first moves, the kyu rank bot plays great and fast from kyu rank 10 and stronger. aki65's binary w/optimized net, .so, .cfg files: https://easyupload.io/yza296
Can someone run the optimized net in android command line and confirm?
I just tried by unzipping the files provided, using adb push to upload them to an S9+, then doing adb shell and running the binary with LD_LIBRARY_PATH=. ./katago_binary_android. When providing no parameter it does work (prints the usage info), but as soon as I add any parameter (for example LD_LIBRARY_PATH=. ./katago_binary_android version) it stops working. It just exits without any error.
I spent some 3 hours during the weekend trying every which way, with different versions of the libraries, command lines, etc. The same method worked for the old katago, leela zero and SAI. I must be missing something...
just saw the new icon in your app, it looks great! I still wish katago used half as many playouts though. or maybe have easy, medium and hard settings? Thanks for putting in all the time!
Plan A was to put in a scaling mechanism like the one in katrain, but I just can't find the time for such an undertaking.
Plan B was to have the app measure the number of playouts per second it gets and then scale the number of playouts for the following moves so as to keep the time per move at 1s on any device. This requires less development, but would make the AI's strength inconsistent across devices.
Plan C would be to at least temporarily have a global setting in the settings page where you can tweak this to your heart's desire. Might go with this in the next version.
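Plan B above can be illustrated with a small sketch. It is not code from the app: the function name `next_playouts` and the `min_playouts` / `max_playouts` bounds are invented for illustration, and it assumes the app can measure how long the previous move took and how many playouts it used.

```python
def next_playouts(prev_playouts: int, prev_seconds: float,
                  target_seconds: float = 1.0,
                  min_playouts: int = 1, max_playouts: int = 10_000) -> int:
    """Choose the playout budget for the next move from the last move's speed.

    The bounds are hypothetical safety limits, not values taken from the app.
    """
    if prev_seconds <= 0:
        # No usable timing measurement: keep the previous budget.
        return prev_playouts
    playouts_per_second = prev_playouts / prev_seconds
    budget = round(playouts_per_second * target_seconds)
    # Clamp so a very slow or very fast device still gets a sane budget.
    return max(min_playouts, min(max_playouts, budget))
```

For example, a device that needed 2 seconds for 100 playouts would be given 50 playouts for its next move. The same arithmetic shows why the strength would differ between devices: a faster phone is simply handed a larger budget.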
oh wow this is going to be great, thanks!
aki65 added support for the new distributed kata1 weights on android. https://github.com/aki65/aki65.github.io/releases
thanks
aki65 released an optimized s580 40b distributed-training weight. It is extremely, extremely fast. https://github.com/aki65/aki65.github.io/releases/tag/v1.4.1 In a quick test, the optimized s580 policy seems to be at least as strong as the non-optimized s509 net, the last 40b of the non-distributed run. Over 6 games of policy (t1 p1 nncache=2) against 20b at 5 playouts (t1 p5), the optimized net scored 4 wins - 2 losses, while the non-optimized s509 scored 2 wins - 4 losses. The optimized net is almost 4 times faster and almost 1/4 the size of the s509, which is a really incredible accomplishment by lightvector, akigo, sanderland and all the people who contributed GPUs to play 3,000,000 training games. Wow.
Imagine if in 2018, when Leela Zero was first starting, someone had told you that 3 years later there would be an app able to play from 10 kyu to 9d on a $100 smartphone, making its moves almost instantly. And that it could play with variable komi and no ladder weaknesses.
If we could also have that open-source, that would be a dream...
Sorry to bother you all after so long! I'm an amateur at programming, and I'm not sure whether this issue is discussing a full apk or just something that works? I ask because I can't stand that LeelaZero can only play on a 19x19 board, which is too big on a phone screen. Today I tried compiling KataGo in Ubuntu in Termux on Android, and it works fine; all you need to do is follow the guide here: https://github.com/lightvector/KataGo/blob/master/Compiling.md#linux
I was able to compile for android with opencl support using my branch: https://github.com/jopdorp/KataGo/tree/android-support
Now at runtime, when I try to use it, I get a crash during the tuning process:
No existing tuning parameters found or parseable or valid at: /data/user/0/nl.jopdorp.opengoban/files/.katago/opencltuning/tune11_gpuMaliG710r0p0_x19_y19_c128_mv8.txt
Performing autotuning
......
Tuning hGemmWmma for convolutions
error: couldn't allocate output register for constraint 'r'
@acristescu do you use OpenCL in the Sente app? If so how did you do the tuning, and would you have some tune files I could try?
Now that we have a CPU-only version of the engine, would it be possible to add arm7 and/or arm64 to the compiled binary of the release? I couldn't get it going, but then I haven't compiled any C++ since I finished uni many years ago :) It should be possible though as there are several of these floating around for both Leela Zero and SAI (for example https://github.com/Grant-Tao/compiled-leelaz-0.17-for-android-phones and https://github.com/evdwerf/leela-zero/tree/android ).
The second repo above even has the modifications to the Makefile for Android (see this comparison https://github.com/leela-zero/leela-zero/compare/next...evdwerf:android ).
Is that something that could be achieved?