pasky / pachi

A fairly strong Go/Baduk/Weiqi playing program
http://pachi.or.cz/
GNU General Public License v2.0

Instructions for cluster mode? #140

Open kcalmond opened 4 years ago

kcalmond commented 4 years ago

After reviewing the repo and issues, I can't find anything that describes how to deploy Pachi on a cluster of nodes.

Specifically, as a hobby project I have built a Raspberry Pi 4B cluster running Ubuntu 18.04 on each node, set up as a k3s (Kubernetes) cluster. I'm looking for interesting apps/services to run on it. Since I'm a Go enthusiast, I thought running a Go bot server on the cluster would be an interesting project. So the deployment mode here would be clustered Pachi, in containers. Any ideas?

lemonsqueeze commented 4 years ago

Hi. Sure, depending on what you have in mind, there are lots of things you could do with these Raspberry Pis. Distributed Pachi lets you build a stronger (or faster) engine. Or, if you're interested in making it play online, you could run many instances instead of just one or two.

Distributed mode needs some work at the moment. It looks like I broke it with the last GTP layer upgrade, but that should be easy to fix. As for strength, expect to spend huge amounts of CPU to gain just one stone. For example, at one point I was running Pachi on KGS with 180k playouts, and it was only one stone stronger than it is now with 5-15k playouts. Patterns have improved since then, so it might be stronger now. distributed/distributed.c has instructions on how to set it up. Basically you start it with:

pachi -e distributed slave_port=1234          # master node
pachi -e uct -g masterhost:1234 slave         # slaves
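For a cluster with several nodes, the two commands above can be generated per node. The following is only a sketch: the pachi flags are the ones quoted above, while the function names, the use of ssh to reach each node, and the node names are assumptions made for illustration. It prints a plan instead of executing anything.

```shell
#!/bin/sh
# Dry-run helper: prints launch commands for a distributed Pachi setup.
# Only the pachi flags come from the thread; ssh and node names are assumptions.

# Command for the master node, which listens for slaves on the given port.
pachi_master_cmd() {
    port=$1
    printf 'pachi -e distributed slave_port=%s\n' "$port"
}

# Command for one slave, which connects back to the master.
pachi_slave_cmd() {
    master_host=$1
    port=$2
    printf 'pachi -e uct -g %s:%s slave\n' "$master_host" "$port"
}

# Print the master command, then one ssh line per slave node.
print_cluster_plan() {
    master_host=$1
    port=$2
    shift 2
    pachi_master_cmd "$port"
    for node in "$@"; do
        printf 'ssh %s %s\n' "$node" "$(pachi_slave_cmd "$master_host" "$port")"
    done
}
```

For example, `print_cluster_plan pi0 1234 pi1 pi2` prints the master command followed by one ssh line each for the hypothetical nodes pi1 and pi2.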

For online play with multiple instances (several bots, each playing its own game) there is infrastructure you can use. KGS is easy but not very flexible: you set up one instance/account on each Raspberry Pi (maybe even two, with the FIFO lock enabled so they don't take the CPU at the same time). OGS lets you load-balance, so you could build something that accepts as many games as you have CPU power for and dispatches game commands to each node. Fox also has a GTP client, and I don't think Pachi is playing there, so that could be an idea too: that server is huge!
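The load-balancing idea (accept only as many games as the nodes can handle, and send each new game to a free node) could be sketched as below. This is purely illustrative and does not use any OGS API: the function name, the "node active_games" input format, and the per-node limit are all assumptions.

```shell
#!/bin/sh
# Pick the node with the fewest active games that is still under its limit.
# Input on stdin: one "node active_games" pair per line (hypothetical format).
# Prints the chosen node; prints nothing and returns 1 when every node is full.
pick_node() {
    max_games=$1
    awk -v max="$max_games" '
        ($2 + 0) < (max + 0) && (best == "" || ($2 + 0) < least) {
            best = $1
            least = $2 + 0
        }
        END {
            if (best == "") exit 1
            print best
        }
    '
}
```

With a limit of two games per node, `printf 'pi1 2\npi2 0\npi3 1\n' | pick_node 2` selects pi2; a non-zero exit status means the new game should be declined.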

lemonsqueeze commented 4 years ago

Here's the fix for the distributed engine: PR #141. It should work fine with it now.

kcalmond commented 4 years ago

@lemonsqueeze My draft process for trying this out is below. GitHub newbie here: how do I download and run the version that includes your distributed patch? (I see the PR is still outstanding, so I'm assuming the current binary release will not include it.)

  1. Download and run single-node Pachi on an Ubuntu host (to make sure I know how to run it and can connect a GTP client to it). Q: which GTP client should I use to smoke test?
  2. Repeat #1 in Docker: create a Dockerfile, image, etc.
  3. Test distributed mode in Docker (run multiple containers on the same host)
  4. Test distributed mode in k8s (deployment, Helm chart, etc.)
  5. ? k8s horizontal pod autoscaler for new game/session requests?

lemonsqueeze commented 4 years ago

Yes, you will need to build Pachi yourself for this; binary releases don't have the distributed engine built in.

lemonsqueeze commented 4 years ago

If you're new to Pachi, I'd suggest playing with it in normal mode first to get familiar with it. You can bring it up in a GTP client like gogui, or just type the commands directly:

$ ./pachi
boardsize 19
clear_board
genmove b

This should make it generate a move, for example.
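The same commands can be piped in for a non-interactive smoke test. This is a minimal sketch: the function names are made up for illustration, `quit` is the standard GTP command for ending a session, and the parsing relies on GTP's convention that successful replies start with `=`.

```shell
#!/bin/sh
# Emit a minimal GTP session: set up a board, ask for one black move, then quit.
gtp_smoke_commands() {
    size=${1:-19}
    printf 'boardsize %s\nclear_board\ngenmove b\nquit\n' "$size"
}

# Read GTP responses on stdin and print the last non-empty successful reply,
# which for the session above is the generated move ("= Q16" gives "Q16").
gtp_last_reply() {
    awk '/^= *[^ ]/ { sub(/^= */, ""); reply = $0 } END { print reply }'
}
```

A possible invocation is `gtp_smoke_commands 19 | ./pachi 2>/dev/null | gtp_last_reply`, assuming Pachi writes its log to stderr and its GTP replies to stdout.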

Then you can try the distributed engine on the same machine:

./pachi -t =4000  -e distributed slave_port=1234

And in another terminal:

./pachi  -g localhost:1234 slave

Now, if you type commands in the first window, the master should get the slave to generate a move.

kcalmond commented 4 years ago

Thx again @lemonsqueeze 👍. make ran fine on my Ubuntu 18.04 system, and so did make install, although I had to run that one with sudo. When running make install-data I got the warnings below on stdout. As a guess, I ran make install-data again after unzipping the detlef54 data files into /usr/local/share/pachi, but it produced the same output:

~/git/pachi  master ?1  sudo make install-data
/usr/bin/install -d /usr/share/pachi-go
/usr/bin/install patterns_mm.gamma /usr/share/pachi-go/
/usr/bin/install patterns_mm.spat /usr/share/pachi-go/
WARNING: book.dat datafile is missing
WARNING: golast19.prototxt datafile is missing
WARNING: golast.trained datafile is missing
/usr/bin/install joseki19.gtp /usr/share/pachi-go/

I tried starting pachi after the above. Looks like I need to put the dcnn files somewhere:

 ~/git/pachi  master ?1  Pachi 12.50 (Jowa)  
git 44a9f917 (master)
haswell dcnn build, Jul  4 2020

Random seed: 1593907223
Scoring: using mcts (possibly inaccurate)
Loading dcnn files: detlef54.prototxt, detlef54.trained
Couldn't find dcnn files, aborting.

[1]  + 2133 exit 1     pachi

Since it couldn't find the dcnn files, I guessed that copying the detlef54 files from /usr/local/share/pachi into /usr/share/pachi-go might help:

 ~/git/pachi  master ?1  sudo cp /usr/local/share/pachi/* /usr/share/pachi-go
 ~/git/pachi  master ?1  pachi &
[1] 2138
 ~/git/pachi  master ?1  Pachi 12.50 (Jowa)
git 44a9f917 (master)
haswell dcnn build, Jul  4 2020

Random seed: 1593907336
Scoring: using mcts (possibly inaccurate)
F0704 16:29:22.085989  2138 io.cpp:36] Check failed: fd != -1 (-1 vs. -1) File not found: /usr/share/pachi-go/detlef54.prototxt
*** Check failure stack trace: ***
    @     0x7f5cfa1bb0cd  google::LogMessage::Fail()
    @     0x7f5cfa1bcf33  google::LogMessage::SendToLog()
    @     0x7f5cfa1bac28  google::LogMessage::Flush()
    @     0x7f5cfa1bd999  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f5cfa886f91  caffe::ReadProtoFromTextFile()
    @     0x7f5cfa8a4d26  caffe::ReadNetParamsFromTextFileOrDie()
    @     0x7f5cfa840cb8  caffe::Net<>::Net()
    @     0x55583c7a4030  (unknown)
    @     0x55583c7a443e  (unknown)
    @     0x55583c7a783f  (unknown)
    @     0x55583c7e2fde  (unknown)
    @     0x55583c7e33b3  (unknown)
    @     0x55583c7a3487  (unknown)
    @     0x7f5cf9281b97  __libc_start_main
    @     0x55583c7a3e2a  (unknown)

[1]  + 2138 abort (core dumped)  pachi

After fixing permissions on /usr/share/pachi-go/detlef54.prototxt...

 ~/git/pachi  master ?1  sudo chmod 755 /usr/share/pachi-go/*  
 ~/git/pachi  master ?1  pachi &  
[1] 2156
 ~/git/pachi  master ?1  Pachi 12.50 (Jowa)
git 44a9f917 (master)
haswell dcnn build, Jul  4 2020

Random seed: 1593907535
Scoring: using mcts (possibly inaccurate)
Loaded Detlef's 54% dcnn for 19x19
Loaded spatial dictionary of 18016 patterns.
Loaded 18101 gammas.
Checking gammas ... OK
Threads: 2

[1]  + 2156 suspended (tty input)  pachi
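Both failures above (the files not being found at all, then "File not found" until the permissions were changed) can be checked for before starting pachi. The sketch below assumes the file names that appear in this output; the function name is made up, and a given build may need a different set of files.

```shell
#!/bin/sh
# Report any Pachi data file that is missing or unreadable in the given directory.
# Returns 0 when every listed file is readable by the current user, 1 otherwise.
check_pachi_data() {
    data_dir=$1
    rc=0
    for f in detlef54.prototxt detlef54.trained patterns_mm.gamma patterns_mm.spat; do
        if [ ! -e "$data_dir/$f" ]; then
            echo "missing:    $data_dir/$f"
            rc=1
        elif [ ! -r "$data_dir/$f" ]; then
            echo "unreadable: $data_dir/$f"
            rc=1
        fi
    done
    return "$rc"
}
```

Running `check_pachi_data /usr/share/pachi-go` as the user who will start pachi prints one line per problem file. Data files only need to be readable, so a mode such as 644 should be enough in place of 755.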

Looks like I've got something running now. Nice recommendation - opening at the 4-4 point :-)

 ~/git/pachi  master ?1  ps                                  
  PID TTY          TIME CMD
 1812 pts/0    00:00:01 zsh
 1833 pts/0    00:00:00 zsh
 1840 pts/0    00:00:00 gitstatusd-linu
 2156 pts/0    00:00:00 pachi
 2159 pts/0    00:00:00 ps
 ~/git/pachi  master ?1  fg                             
[1]  + 2156 continued  pachi
boardsize 19
clear_board
genmove bIN: boardsize 19
=

IN: clear_board
=
help
IN: genmove bhelp
Move:   0  Komi: 0.0  Handicap: 0  Captures B: 0 W: 0
      A B C D E F G H J K L M N O P Q R S T        A B C D E F G H J K L M N O P Q R S T
    +---------------------------------------+    +---------------------------------------+
 19 | . . . . . . . . . . . . . . . . . . . | 19 | . . . . . . . . . . . . . . . . . . . |
 18 | . . . . . . . . . . . . . . . . . . . | 18 | . . . . . . . . . . . . . . . . . . . |
 17 | . . . . . . . . . . . . . . . . . . . | 17 | . . . . . . . . . . . . . . . . . . . |
 16 | . . . . . . . . . . . . . . . . . . . | 16 | . . . . . . . . . . . . . . . . . . . |
 15 | . . . . . . . . . . . . . . . . . . . | 15 | . . . . . . . . . . . . . . . . . . . |
 14 | . . . . . . . . . . . . . . . . . . . | 14 | . . . . . . . . . . . . . . . . . . . |
 13 | . . . . . . . . . . . . . . . . . . . | 13 | . . . . . . . . . . . . . . . . . . . |
 12 | . . . . . . . . . . . . . . . . . . . | 12 | . . . . . . . . . . . . . . . . . . . |
 11 | . . . . . . . . . . . . . . . . . . . | 11 | . . . . . . . . . . . . . . . . . . . |
 10 | . . . . . . . . . . . . . . . . . . . | 10 | . . . . . . . . . . . . . . . . . . . |
  9 | . . . . . . . . . . . . . . . . . . . |  9 | . . . . . . . . . . . . . . . . . . . |
  8 | . . . . . . . . . . . . . . . . . . . |  8 | . . . . . . . . . . . . . . . . . . . |
  7 | . . . . . . . . . . . . . . . . . . . |  7 | . . . . . . . . . . . . . . . . . . . |
  6 | . . . . . . . . . . . . . . . . . . . |  6 | . . . . . . . . . . . . . . . . . . . |
  5 | . . . . . . . . . . . . . . . . . . . |  5 | . . . . . . . . . . . . . . . . . . . |
  4 | . . . . . . . . . . . . . . . . . . . |  4 | . . . . . . . . . . . . . . . . . . . |
  3 | . . . . . . . . . . . . . . . . . . . |  3 | . . . . . . . . . . . . . . . . . . . |
  2 | . . . . . . . . . . . . . . . . . . . |  2 | . . . . . . . . . . . . . . . . . . . |
  1 | . . . . . . . . . . . . . . . . . . . |  1 | . . . . . . . . . . . . . . . . . . . |
    +---------------------------------------+    +---------------------------------------+

desired 9.00, worst 10.00, clock [1] 0.00 + 10.00/1*1, lag 2.00
mcowner 0.09s
dcnn in 0.09s
dcnn = [ Q16 Q17 D16 Q14 R16 P17 K10 C16 R17 D4  D17 P16 P15 Q4  R15 Q15 O16 K11 N13 C15 ]
       [ 56  23  6   2   2   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0   ]
[1000] best 50.3% xkomi 7.5 | seq Q16 C16  Q4     | can b Q16(50.3) Q17(47.3)
[2000] best 48.2% xkomi 7.5 | seq Q16  D4 D16 C14 | can b Q16(48.2) Q17(49.7)
[3000] best 47.9% xkomi 7.5 | seq Q16  D4 D16 C14 | can b Q16(47.9) Q17(50.6)
[4000] best 48.2% xkomi 7.5 | seq Q16  D4 D16  Q4 | can b Q16(48.2) Q17(50.4)
[5000] best 47.7% xkomi 7.5 | seq Q16 D16  Q4  D4 | can b Q16(47.7) Q17(49.3)
[6000] best 47.7% xkomi 7.5 | seq Q16 D16  C4  Q4 | can b Q16(47.7) Q17(49.6)
[7000] best 48.0% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(48.0) Q17(49.4)
[8000] best 48.0% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(48.0) Q17(49.7)
[9000] best 47.8% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(47.8) Q17(49.5)
[10000] best 47.7% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(47.7) Q17(49.3)
[11000] best 47.9% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(47.9) Q17(49.1)
[12000] best 48.0% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(48.0) Q17(49.1)
[13000] best 48.0% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(48.0) Q17(49.2)
[14000] best 48.4% xkomi 7.5 | seq Q16 D16  C4  E4 | can b Q16(48.4) Q17(49.2)
[15000] best 48.6% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.6) Q17(49.1)
[16000] best 48.7% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.7) Q17(49.1)
[17000] best 48.7% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.7) Q17(49.0)
[18000] best 48.6% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.6) Q17(49.2)
[19000] best 48.6% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.6) Q17(49.3)
[20000] best 48.5% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.5) Q17(49.2)
[21000] best 48.7% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.7) Q17(49.1)
[22000] best 48.8% xkomi 7.5 | seq Q16 D16  D4  Q4 | can b Q16(48.8) Q17(49.0)
[23000] best 48.8% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.8) Q17(48.9)
[24000] best 48.8% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.8) Q17(48.9)
[25000] best 48.7% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.7) Q17(48.8)
[26000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9)
[27000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9)
[28000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.9) P17(46.9)
[29000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(49.0) P17(47.0)
[30000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(49.0) P17(46.2)
[31000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.9) P17(46.3)
[32000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.9) K10(51.6) P17(46.3)
[33000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9) K10(50.9) P17(47.3)
[34000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9) K10(50.7) P17(47.3)
[35000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9) K10(50.6) P17(48.4)
[36000] best 48.8% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.8) Q17(48.9) K10(50.6) P17(48.3)
[37000] best 48.9% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(48.9) Q17(48.9) K10(50.5) P17(48.3)
[38000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.9) K10(50.4) P17(48.3)
[39000] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.8) K10(50.3) P17(48.0)
[40000] best 48.9% xkomi 7.5 | seq Q16 D16 B15 B16 | can b Q16(48.9) Q17(48.7) K10(50.2) P17(48.1)
[41000] best 48.9% xkomi 7.5 | seq Q16 D16 B15 B16 | can b Q16(48.9) Q17(48.9) K10(50.2) P17(48.1)
[42000] best 48.9% xkomi 7.5 | seq Q16 D16 B15 B16 | can b Q16(48.9) Q17(48.8) K10(50.0) P17(47.9)
(avg score -1.746079/42781; dynkomi's -1.746079/42781 value 0.488803/42781)
*** WINNER is Q16 with score 0.4896 (26174/42781:42781/42781 games), extra komi 7.500000
genmove in 8.09s, mcts 7.92s (5403 games/s, 2701 games/s/thread)
[42781] best 49.0% xkomi 7.5 | seq Q16 D16  D4  C9 | can b Q16(49.0) Q17(48.7) K10(50.0) P17(47.8)
tree pruned in 0.046s, prev 0.0s ago, dest depth 14 wanted 3, size 79089736->41943528/41943040, playouts 26174
temp tree overflow, max_tree_size 209715200, pruning_threshold 20971520
Move:   1  Komi: 0.0  Handicap: 0  Captures B: 0 W: 0  Score Est: B+3.0
      A B C D E F G H J K L M N O P Q R S T        A B C D E F G H J K L M N O P Q R S T
    +---------------------------------------+    +---------------------------------------+
 19 | . . . . . . . . . . . . . . . . . . . | 19 | , , , , , , , , , , , , , , , , , , , |
 18 | . . . . . . . . . . . . . . . . . . . | 18 | , , , , , , , , , , , , , , , , , , , |
 17 | . . . . . . . . . . . . . . . . . . . | 17 | , , , , , , , , , , , , , , , x x , , |
 16 | . . . . . . . . . . . . . . . X). . . | 16 | , , , , , , , , , , , , , , , x , , , |
 15 | . . . . . . . . . . . . . . . . . . . | 15 | , , , , , , , , , , , , , , , , , , , |
 14 | . . . . . . . . . . . . . . . . . . . | 14 | , , , , , , , , , , , , , , , , , , , |
 13 | . . . . . . . . . . . . . . . . . . . | 13 | , , , , , , , , , , , , , , , , , , , |
 12 | . . . . . . . . . . . . . . . . . . . | 12 | , , , , , , , , , , , , , , , , , , , |
 11 | . . . . . . . . . . . . . . . . . . . | 11 | , , , , , , , , , , , , , , , , , , , |
 10 | . . . . . . . . . . . . . . . . . . . | 10 | , , , , , , , , , , , , , , , , , , , |
  9 | . . . . . . . . . . . . . . . . . . . |  9 | , , , , , , , , , , , , , , , , , , , |
  8 | . . . . . . . . . . . . . . . . . . . |  8 | , , , , , , , , , , , , , , , , , , , |
  7 | . . . . . . . . . . . . . . . . . . . |  7 | , , , , , , , , , , , , , , , , , , , |
  6 | . . . . . . . . . . . . . . . . . . . |  6 | , , , , , , , , , , , , , , , , , , , |
  5 | . . . . . . . . . . . . . . . . . . . |  5 | , , , , , , , , , , , , , , , , , , , |
  4 | . . . . . . . . . . . . . . . . . . . |  4 | , , , , , , , , , , , , , , , , , , , |
  3 | . . . . . . . . . . . . . . . . . . . |  3 | , , , , , , , , , , , , , , , , , , , |
  2 | . . . . . . . . . . . . . . . . . . . |  2 | , , , , , , , , , , , , , , , , , , , |
  1 | . . . . . . . . . . . . . . . . . . . |  1 | , , , , , , , , , , , , , , , , , , , |
    +---------------------------------------+    +---------------------------------------+

= Q16

lemonsqueeze commented 4 years ago

Yes, that looks good, well done! By the way, you don't really need to install anything (the sudo part) at this point. If you copy the data files into the directory where you built Pachi, you can run everything from there: ./pachi, or the full path, say ~/src/pachi/pachi if the build is in ~/src/pachi. But having them in /usr/share/pachi-go won't hurt either.

lemonsqueeze commented 4 years ago

The distributed engine fix (PR #141) has been merged, so you can now clone the main repository to play with the distributed engine:

git clone https://github.com/pasky/pachi