CuriosAI / sai

SAI: a fork of Leela Zero with variable komi.
GNU General Public License v3.0

Feature explanation: advanced input planes #4

Open Vandertic opened 6 years ago

Vandertic commented 6 years ago

This feature is motivated by the fact that AlphaGo had richer input planes than AlphaGo Zero and AlphaZero, the latter two using only the board configuration for the last 8 turns. One could argue that AlphaGo was experimental and that the other programs are its evolution. It must be noted, however, that even if they eventually surpassed AlphaGo's strength statistically, they were never exposed to the public as the first one was: AlphaGo was the only program trusted to play in public, against Lee Sedol, Ke Jie and, as Master, against several other professionals. We believe the reason may be that Go-playing programs built with this general approach are universally affected by sporadic weaknesses in rare game situations, such as ladders, huge groups with no eyes, or sekis. Putting extra information in the input planes may mitigate these problems, and AlphaGo had a huge amount of such information in its input planes.

One possible issue is that complex information deduced from the board configuration may sometimes be wrong. AlphaGo had an input field saying whether a stone may be captured in a ladder. This is not easy to compute with zero errors, and many people thought that AlphaGo lost game 4 to Lee Sedol because it was blind to a complicated ladder, owing to an error in the implementation of this input field. To avoid such situations while keeping some of the benefits of the advanced input features, we have chosen only two very simple additional bit fields:

In fact, we observed in previous 7x7 runs that even very strong networks would sometimes lose to a much weaker one because of an unseen atari on a large group, and we felt that ko could be easier to learn this way. After experimenting with this feature, we can conclude that it greatly accelerates learning in the early stages. It makes little difference once the nets are strong, but the occasional lost games against weak nets disappeared. The experiments were done with the history shortened from 8 to 4 turns, in order to keep the number of input planes constant (8x2 = 4x4).
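The plane arithmetic behind "8x2 = 4x4" can be written out as a small sketch. It assumes that each turn of history carries two stone planes (one per color) and that, with advanced features, the two extra bit fields are also stored per turn. The function and constant names are hypothetical and not taken from the SAI source, and any side-to-move planes are left out of the count.

```python
# Hypothetical names: a sketch of the plane-count bookkeeping only.
STONE_PLANES_PER_TURN = 2     # one plane per stone color
ADVANCED_PLANES_PER_TURN = 2  # assumption: the two extra bit fields are stored per turn


def history_planes(turns: int, advanced: bool) -> int:
    """Number of history input planes for a given history length."""
    per_turn = STONE_PLANES_PER_TURN
    if advanced:
        per_turn += ADVANCED_PLANES_PER_TURN
    return turns * per_turn


standard = history_planes(8, advanced=False)  # 8 turns x 2 planes
with_adv = history_planes(4, advanced=True)   # 4 turns x 4 planes
print(standard, with_adv)
```

Under these assumptions both configurations give the same number of history planes, so the network's input size does not change.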

As for the implementation, nets that expect advanced input planes are marked with version number 17 in the first line of the weights file. The program recognizes this automatically and gathers the correct information for playing. The command-line option --adv_features, instead, changes how the training information is recorded, putting the correct input planes in the output files. Training with TF is no different with or without this feature, as long as the training data is the correct one. The option in config.py, `WEIGHTS_FILE_VER = "17"  # 1: LZ, 17: 'advanced features'`, just writes 17 in the first line.
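To illustrate the version check described above, here is a minimal sketch that reads the first line of a weights file and decides whether the net expects advanced input planes. It assumes an uncompressed text file. The function names are hypothetical; this is not how the SAI engine itself is written.

```python
import os
import tempfile

ADVANCED_FEATURES_VERSION = "17"  # version number described in this issue


def weights_version(path: str) -> str:
    """Return the version string stored on the first line of a weights file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.readline().strip()


def expects_advanced_planes(path: str) -> bool:
    """True if the net was saved with the 'advanced features' version number."""
    return weights_version(path) == ADVANCED_FEATURES_VERSION


# Usage with a throwaway file whose contents are made up for the example.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("17\n0.1 0.2 0.3\n")
    demo_path = tmp.name

print(expects_advanced_planes(demo_path))
os.remove(demo_path)
```

A compressed weights file would have to be opened with the matching decompression module first; the sketch does not handle that.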

Nazgand commented 6 years ago

A possible input plane is a 'fake vertex' plane. Such a vertex would be illegal to play on and would not be counted as a liberty by either color. This would allow the same network to play on smaller board sizes by filling some outer rows and columns with fake vertices. People could also use this to play Go on boards that look like Swiss cheese. It might be fun to watch a game on a board where a cross of fake vertices through the tengen splits it into a combination of four 9x9 games. https://github.com/gcp/leela-zero/issues/1835#issuecomment-434607118
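A rough sketch of what such a plane could look like, purely as an illustration of the idea: a 0/1 mask over the board, plus a neighbor helper that skips fake vertices so they never count as liberties. All names here are hypothetical, and nothing like this is claimed to exist in SAI or Leela Zero.

```python
def small_board_plane(board_size: int, playable_size: int) -> list:
    """Mask with 1 on fake vertices; the playable area is the top-left corner."""
    return [[0 if (r < playable_size and c < playable_size) else 1
             for c in range(board_size)]
            for r in range(board_size)]


def cross_plane(board_size: int) -> list:
    """Mask with a cross of fake vertices through the center point (tengen)."""
    mid = board_size // 2
    return [[1 if (r == mid or c == mid) else 0
             for c in range(board_size)]
            for r in range(board_size)]


def real_neighbors(r: int, c: int, fake: list) -> list:
    """On-board neighbors of (r, c) that are not fake vertices."""
    size = len(fake)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(y, x) for y, x in candidates
            if 0 <= y < size and 0 <= x < size and not fake[y][x]]


cross = cross_plane(19)
# A point next to the cross sees only two real neighbors.
print(real_neighbors(8, 8, cross))
```

On a 19x19 board the cross occupies row and column index 9, which leaves four separate 9x9 quadrants, matching the game suggested above.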