pytorch / ELF

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

Which config was used to defeat Leela Zero? #23

Closed maxim28 closed 6 years ago

maxim28 commented 6 years ago

Hi, I have successfully run the OpenGo bot with the following command, which uses 4096*2=8192 rollouts:

./gtp.sh ./pretrained-go-19x19-v0.bin --verbose --gpu 0 --num_block 20 --dim 224 --mcts_puct 1.50 --batchsize 16 --mcts_rollout_per_batch 16 --mcts_threads 2 --mcts_rollout_per_thread 4096 --resign_thres 0.05 --mcts_virtual_loss 1

On the other hand, I also started a Leela Zero bot with model 158603eb using the following command, which also uses 8192 rollouts:

src/leelaz -p 8192 -v 0 -r 5 --timemanage off -t 1 -d --noponder --gpu 1 --weights 158603eb61a1e5e9dcd1aee157d813063292ae68fbc8fcd24502ae7daf4d7948 --gtp
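As a sanity check on the two search budgets, here is a minimal sketch that multiplies out the values from the commands above. It assumes the OpenGo total per move is `--mcts_threads` times `--mcts_rollout_per_thread`, as the 4096*2 figure implies, and that Leela Zero's `-p` value is directly comparable. The helper name `total_rollouts` is made up for illustration.

```python
def total_rollouts(threads: int, rollouts_per_thread: int) -> int:
    """Total rollouts per move, assuming each search thread runs its own quota."""
    return threads * rollouts_per_thread


# Values taken from the gtp.sh command: --mcts_threads 2, --mcts_rollout_per_thread 4096
opengo = total_rollouts(threads=2, rollouts_per_thread=4096)

# Value taken from the leelaz command: -p 8192
leela_playouts = 8192

print(opengo, leela_playouts, opengo == leela_playouts)
```

Under that assumption both bots search the same number of positions per move, so the budget alone should not explain a large strength gap.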

I put both bots on my private CGOS server and found that OpenGo could not defeat Leela Zero in the most recent 3 games. That seems weird to me, because the author of Leela Zero reported that leela-elf (the Leela bot using ELF weights) beat leela-zero 167:18:

leelaz-18e6 v leelaz-elf (185/1000 games)
board size: 19   komi: 7.5
              wins              black         white       avg cpu
leelaz-18e6     18  9.73%       9   9.68%     9   9.78%     81.09
leelaz-elf     167 90.27%       83 90.22%     84 90.32%    127.66
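To put the three lost games in perspective, the sketch below recomputes the win rate from the table and estimates how likely three straight losses would be at that rate. It assumes the games are independent and that the table's win rate carries over to this pairing, which is not guaranteed since the table is for leelaz-elf against leelaz-18e6 rather than OpenGo against 158603eb.

```python
# Counts from the table above (185 games played so far)
elf_wins = 167
lz_wins = 18
games = elf_wins + lz_wins

elf_win_rate = elf_wins / games   # fraction of games won by leelaz-elf
p_single_loss = lz_wins / games   # fraction of games lost by leelaz-elf

# Probability of losing three games in a row, assuming independent games
p_three_losses = p_single_loss ** 3

print(f"win rate: {elf_win_rate:.2%}")
print(f"P(three straight losses): {p_three_losses:.4%}")
```

If the assumptions held, three losses in a row would be very unlikely by chance, which points toward a configuration problem rather than bad luck.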

So I am curious whether my OpenGo config is wrong. How should I set the config so as to defeat Leela Zero?

MartinVingerhoets commented 6 years ago

The weight ELF used to win 980:18 was an older weight (158603eb), so it's not weird that a newer weight wins more games. They also capped the rollouts at 40000, which took around 50 seconds a move on a V100.
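For a rough sense of scale, the figures quoted above imply a search throughput that can be worked out as follows. This is only a back-of-the-envelope sketch: it assumes the 40000 rollouts and 50 seconds describe the same move on the same V100, and that time scales linearly with rollout count. The helper `seconds_for` is hypothetical.

```python
# Figures quoted in the comment above
rollouts_per_move = 40000
seconds_per_move = 50

# Implied throughput on that hardware
rollouts_per_second = rollouts_per_move / seconds_per_move


def seconds_for(rollouts: int) -> float:
    """Estimated time per move at the implied throughput (linear scaling assumed)."""
    return rollouts / rollouts_per_second


print(rollouts_per_second)   # rollouts per second implied by the quoted numbers
print(seconds_for(8192))     # estimated seconds per move for an 8192-rollout budget
```

On other GPUs or batch sizes the throughput will differ, so the estimate is only indicative.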

jma127 commented 6 years ago

Thanks for your interest! That looks pretty similar to our LZ/pro game configs, although I'd have to double check to be sure. The only change would be to the rollout count for ELF OpenGo -- we used 40000 per thread.

maxim28 commented 6 years ago

@jma127 I finally found the root cause: it is the leaky ReLU flag.

I cloned the OpenGo code 2 days ago. At that time the leaky ReLU flag was on in gtp.sh, so when I started the OpenGo bot it played so badly that it could not defeat Leela Zero. Just now I found that you submitted a commit (Remove leaky relu flag in script); after I updated the repo and started OpenGo again, it plays well.

dylandjian commented 6 years ago

There is something I don't quite understand: when you say 40k rollouts per thread, do you mean 40k simulations from root to leaf, or something else? If so, why that many simulations? To me it means a move would take at LEAST 10 seconds, which doesn't seem right. Or perhaps you recommend that many simulations for play, but used a much lower number during training?

dylandjian commented 6 years ago

Answered here