huangeddie / GymGo

An environment of the board game Go using OpenAI's Gym API
165 stars 31 forks

Super ko #21

Open RohanM opened 2 years ago

RohanM commented 2 years ago

This PR is based on the following, which should be reviewed and merged first:

Initial discussion: #13

Implements the positional super ko rule, which forbids any move that recreates a board position seen earlier in the game, not just the immediate take-back covered by simple ko. This rule is particularly useful on small boards, where multi-step cycles are easy to create.
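As a rough illustration of the idea (not the code in this PR), a positional super ko check can be sketched as a set of every board position reached so far, with a candidate position rejected if it is already in the set. The names `freeze` and `PositionHistory` are hypothetical and are not part of GymGo's API. The boards are arbitrary 2x2 grids chosen to show the bookkeeping, not a legal Go sequence.

```python
from typing import Iterable, Set, Tuple

# Hypothetical board encoding for this sketch: 0 = empty, 1 = black, 2 = white.
Board = Tuple[Tuple[int, ...], ...]


def freeze(board: Iterable[Iterable[int]]) -> Board:
    """Convert a mutable grid into a hashable snapshot."""
    return tuple(tuple(row) for row in board)


class PositionHistory:
    """Remembers every board position reached so far in one game."""

    def __init__(self) -> None:
        self._seen: Set[Board] = set()

    def record(self, board: Iterable[Iterable[int]]) -> None:
        """Store the position that results from a move that was played."""
        self._seen.add(freeze(board))

    def repeats(self, board: Iterable[Iterable[int]]) -> bool:
        """Return True if this position already occurred earlier in the game."""
        return freeze(board) in self._seen


# Placeholder positions, recorded in the order they were "played".
a = [[0, 1], [2, 0]]
b = [[1, 1], [2, 0]]
c = [[1, 1], [0, 2]]

history = PositionHistory()
for position in (a, b, c):
    history.record(position)

# A move that would bring the board back to `a` closes a three-step cycle.
# A simple ko check that only looks at the most recent positions could miss it;
# the full history catches it.
cycle_detected = history.repeats(a)
fresh_position_ok = not history.repeats([[2, 2], [0, 0]])
```

Keeping the whole history means every candidate move needs one extra lookup, and every played move adds one stored position, which is one plausible source of the overhead when child states are enumerated.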

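The rule is disabled by default because a) we don't want to change the gym's default behaviour, and b) it significantly affects performance. In the benchmark below, enabling super ko raises the Ordered Trajs average from 0.019 to 0.493 (about 26x) and the Rand Trajs w/ Children average from 0.510 to 1.107 seconds (about 2.2x), while the lower bound is unchanged: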

❯ python gym_go/tests/efficiency.py
100%|█████████████████████████████████████████████████████| 64/64 [00:00<00:00, 371.53it/s]
Lower bound: 0.003 AVG, 0.000 STD
100%|█████████████████████████████████████████████████████| 64/64 [00:00<00:00, 374.59it/s]
Lower bound (super ko): 0.003 AVG, 0.000 STD
100%|██████████████████████████████████████████████████████| 64/64 [00:01<00:00, 53.18it/s]
Ordered Trajs: 0.019 AVG, 0.000 STD
100%|██████████████████████████████████████████████████████| 64/64 [00:31<00:00,  2.03it/s]
Ordered Trajs (super ko): 0.493 AVG, 0.006 STD
100%|██████████████████████████████████████████████████████| 64/64 [00:32<00:00,  1.96it/s]
Rand Trajs w/ Children: 0.510 AVG SEC, 0.029 STD SEC, 161.0 AVG STEPS
100%|██████████████████████████████████████████████████████| 64/64 [01:10<00:00,  1.11s/it]
Rand Trajs w/ Children (super ko): 1.107 AVG SEC, 0.082 STD SEC, 161.0 AVG STEPS
.
----------------------------------------------------------------------
Ran 6 tests in 136.795s