lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

Is it okay to have only a current board position but no previous moves? #364

Open fallcat opened 3 years ago

fallcat commented 3 years ago

I'm trying to extract features given only a current board position and the last move. I saw that the binary input features include five channels encoding the positions of the previous five moves. Is it okay to leave the rest of the previous-move channels blank, and the global input features' pass channels blank (since the players are not passing), and still get a reasonable result? (I'm looking at the python code.)

lightvector commented 3 years ago

Yes. During training, varying numbers of the history planes are set to all zero a small percentage of the time, specifically to guarantee that the neural net has experience with arbitrary amounts of history, including none at all, and still has to make the same outputs and predictions regardless.

You'll need to do a similar thing with the ladder planes, but there the correct thing is to copy them backwards (make them duplicates of the most recent plane you do have available) instead of zeroing them out.

Having more history is better than not having it if you can manage it, though - it improves prediction quality, and if you're doing MCTS search it also focuses the search down more relevant and local lines. I did once measure the impact of turning off history inputs entirely, and even though the neural net is trained to be able to make predictions without them, for the full engine it was a loss of more than 100 Elo, maybe more than 200 Elo, I forget exactly. The loss should be much, much smaller if you still have the last move (the last move should be far more important than the ones before it).
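
A minimal sketch of what this truncation might look like on the python-side feature arrays. The channel indices below are hypothetical placeholders (the real layout is defined by KataGo's featurization code and differs between model versions), and the dropout probability is illustrative rather than KataGo's actual number:

```python
import numpy as np

# Hypothetical channel indices, for illustration only -- check the
# featurization code for your model version for the real layout.
PREV_MOVE_CHANNELS = [9, 10, 11, 12, 13]   # locations of the 1st..5th most recent moves
LADDER_HIST_CHANNELS = [14, 15, 16]        # ladder status: now, 1 move ago, 2 moves ago

def truncate_history(bin_input, moves_known):
    """Re-encode a position as if only the `moves_known` (0..5) most
    recent moves were available.

    bin_input: float array of shape (num_channels, board_size, board_size).
    """
    out = bin_input.copy()
    # Unknown recent moves: all-zero planes ("no known recent move here").
    for ch in PREV_MOVE_CHANNELS[moves_known:]:
        out[ch] = 0.0
    # Unknown ladder history: duplicate the most recent plane we do have
    # backwards, rather than zeroing it out.
    last_known = min(moves_known, len(LADDER_HIST_CHANNELS) - 1)
    for i in range(last_known + 1, len(LADDER_HIST_CHANNELS)):
        out[LADDER_HIST_CHANNELS[i]] = out[LADDER_HIST_CHANNELS[last_known]]
    return out

# Training-time augmentation along the lines described above:
rng = np.random.default_rng()

def augment(bin_input, drop_prob=0.05):
    if rng.random() < drop_prob:
        return truncate_history(bin_input, int(rng.integers(0, 6)))
    return bin_input
```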

OmnipotentEntity commented 3 years ago

Do you believe that a prediction output of the most recent N moves might be a useful target for bridging the gap in performance between these two situations?

fallcat commented 3 years ago

Thank you for the explanation! Sorry I missed this earlier, but our project is due today. I'll definitely try that later. We do keep the last move, but we just add the previous moves randomly instead of leaving them blank. I might try making them blank to see how that works.

lightvector commented 3 years ago

Haha, randomly? Ummm, okay. :)

Well, now you know. The proper encoding in KataGo for "no history info" - which is explicitly trained on all the time, and which the actual full engine uses often when people provide analysis positions without supplying any history - is: an all-zero move history ("for each location, there was not a known recent move at that location") and a backward-duplicated ladder status history ("for each group, there is not a known recent change in that group's ladderability").

If you do anything else, you're playing with fire, just a little. The neural net will probably still say something reasonable, but it will be doing so via pure extrapolation, since you'll be giving it an input in a format it has probably never seen in training.
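
A sketch of that encoding, under the same caveat that the channel and global-feature indices here are hypothetical placeholders rather than the real layout:

```python
import numpy as np

# Placeholder indices -- consult the featurization code for your model version.
PREV_MOVE_CHANNELS = [9, 10, 11, 12, 13]   # binary planes: last 5 move locations
LADDER_HIST_CHANNELS = [14, 15, 16]        # binary planes: ladder status history
PASS_FEATURES = [0, 1, 2, 3, 4]            # global features: recent passes

def encode_no_history(bin_input, global_input):
    """The trained-on encoding for 'history unknown'."""
    bin_out = bin_input.copy()
    glob_out = global_input.copy()
    # All-zero move history: for each location, no known recent move there.
    for ch in PREV_MOVE_CHANNELS:
        bin_out[ch] = 0.0
    # Backward-duplicated ladder history: for each group, no known recent
    # change in its ladderability.
    for ch in LADDER_HIST_CHANNELS[1:]:
        bin_out[ch] = bin_out[LADDER_HIST_CHANNELS[0]]
    # And no known recent passes in the global features.
    glob_out[PASS_FEATURES] = 0.0
    return bin_out, glob_out
```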

lightvector commented 3 years ago

Which is not to say that pure extrapolation will necessarily have issues - it might be fine. The net plays superhumanly on 21x21 (almost certainly) despite having no training on that board size. :)

By the way, @fallcat - I'm curious what your project is, if you're able to share. Is this some university research project? Whatever it is, best of luck. :)

lightvector commented 3 years ago

@OmnipotentEntity - maybe, but the most recent N moves are a channel for real new information to get into the net, information that would be costly to compute on its own, just like telling the net about liberties or ladders. So it makes perfect sense why it helps and is not entirely replaceable.

Most of the time (both in the training dataset, and on average at random points within an MCTS tree), the recent N moves will be higher-quality moves than the net would have determined on its own. They will be moves that raw net + search played. Raw net + search, even with very low playouts, is much stronger than raw net. Information about what a much-stronger-than-you agent wanted to play on slightly earlier timesteps is useful information.

Also, you can condition on the fact that the recent moves were not anywhere else. For example, suppose you are comfortably winning, and are in the kind of position where the only way you can lose is if one of your groups somehow gets destabilized and attacked or killed. If you know where the last move or moves are, you know they weren't anywhere else - places where there were no recent moves are less likely to have become unstable. From a Bayesian perspective, it seems like this could also give you a potentially useful prior on the board status, even if it isn't super-reliable.

fallcat commented 3 years ago

@lightvector Not really randomly, but just adding them from top left to bottom right....

That might explain why our result is so bad lol. Because the input is not what the net was trained on...

Our project is for our university's machine learning class. It's not complete yet, but it's here: https://github.com/fallcat/go-review-matcher/ Thank you for the interest! I might try using the correct input you mentioned over winter break.

fallcat commented 3 years ago

Also, the step of converting the board to bin_input_data takes so long lol. I looked at the code to see if there was any way to speed it up, but there are too many complicated for loops, so I guess it's impossible. So I ended up storing them in advance. (Can't store the whole intermediate layers because they are too large.)
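
A minimal caching sketch along those lines, assuming `featurize` stands in for whatever slow pure-python routine builds bin_input_data for one position, and that `cache_path` ends in ".npz":

```python
import numpy as np

def load_or_build_features(positions, cache_path, featurize):
    """Build bin_input_data for many positions once, then reuse from disk."""
    try:
        return np.load(cache_path)["features"]
    except FileNotFoundError:
        feats = np.stack([featurize(pos) for pos in positions])
        # The binary input planes are small and compress well; the net's
        # intermediate activations would not fit, so cache only the inputs.
        np.savez_compressed(cache_path, features=feats)
        return feats
```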