Make Leela Queen of openings

jhorthos commented 5 years ago

Normal Leela training does not explore early-game diversity broadly, presumably forcing her to rely on generalization when playing unusual openings such as the King's Gambit, the Dutch, and many others (and, after the first few ply, any variation her policy has not focused on).

fersbery has written a training branch of Lc0 that permits the use of forced opening diversity. I have started to test whether this can improve play by using the King's Gambit as a test case. In normal training, Leela quickly learns to avoid the defining King's Gambit move (2. f4) because it evaluates poorly for white, but that doesn't stop it from being used in tournaments along with many other unbalanced openings that Leela never trains on.

Starting with net T40.T8.610 (Sufi 15 network), I generated approximately 170,000 training games played from a complex King's Gambit book that includes all common 4-ply, 6-ply, 8-ply, and 10-ply variations (derived from Chad's opening books). These games were mixed with 1 million recent T40 standard pipeline games and used to further train from T40.T8.610 with an LR of 0.003 dropped after 4k steps to 0.001. To test whether play of King's Gambit openings specifically improved, I ran fixed node matches against the parent net with either random King's Gambit (KG) openings from Chad's 8-ply book, or random other openings from the same book (non-King's Gambit, NKG).
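For reference, the training recipe above can be sketched in a few lines. This is only an illustration of the numbers quoted in the post, not the actual Lc0 training configuration: the function names are hypothetical, and whether the drop applies exactly at step 4000 or just after it is an assumption.

```python
def step_lr(step: int, drop_step: int = 4000,
            lr_high: float = 0.003, lr_low: float = 0.001) -> float:
    """Piecewise-constant learning rate: lr_high before drop_step, lr_low after.

    The boundary handling (step 4000 itself already uses lr_low) is an assumption.
    """
    return lr_high if step < drop_step else lr_low


def book_fraction(book_games: int, standard_games: int) -> float:
    """Share of the mixed training set that comes from the opening book."""
    return book_games / (book_games + standard_games)


# Figures from the post: ~170,000 King's Gambit games mixed with 1 million T40 games.
kg_share = book_fraction(170_000, 1_000_000)
print(f"King's Gambit share of training games: {kg_share:.1%}")
print(f"LR at step 0: {step_lr(0)}, LR at step 5000: {step_lr(5000)}")
```

With these figures the book games make up roughly one in seven of the games seen during the extra training.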

Match results (800 nodes):

   # PLAYER                        :  RATING  ERROR  POINTS  PLAYED   (%)    W    L     D  CFS(%)
   1 lc0.net.T15.3-swa-7000-KG     :    17.0   10.3  1048.0    2000    52  553  457   990     100
   2 lc0.net.T8.610                :     0.0   ----  1956.0    4000    49  852  940  2208      60
   3 lc0.net.T15.3-swa-7000-NKG    :    -1.4   10.4   996.0    2000    50  387  395  1218     ---
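
As a rough cross-check of the table, the rating differences can be approximated from the score percentages with the standard logistic Elo formula. This is a minimal sketch, not the rating tool's own method: the table's ratings come from a joint fit over all three players, so small differences are expected, and the function name is hypothetical.

```python
import math


def elo_from_score(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic model."""
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))


# Points and games taken from the table above.
kg_elo = elo_from_score(1048.0, 2000)   # King's Gambit openings
nkg_elo = elo_from_score(996.0, 2000)   # non-King's Gambit openings
print(f"KG: {kg_elo:+.1f} Elo, NKG: {nkg_elo:+.1f} Elo")
```

Both values land close to the RATING column, which suggests the table is internally consistent.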

I tried many other variants of the scheme and they generally seem to gain about 20 Elo in King's Gambit without losing Elo in non KG openings.

Similar results were obtained in a preliminary test with a 128x10b net (CLR-aart-322k), except that Elo was approximately +40 for King's Gambit (perhaps smaller nets don't generalize as well?).

Planning to test the Dutch Defense next as a second test, then might be interested in getting folks to contribute games. Most likely each class of openings will need well over 100k games to get the best results. Eventually I hope an entire opening book (aka "forced opening diversity suite", which sounds more zero) will be part of Leela training.

jhorthos commented 5 years ago

A later test with a more aggressive LR in King's Gambit training suggests about +30 Elo can be had by this method.

RollYrOwn commented 5 years ago

It's an idea I have mixed feelings about, being something of a zero-purist. But I think you can develop a better test case than the Dutch, given its pawn structure often resolving itself into KID or Stonewall type positions, both of which Leela handles fine.

The position that catches Leela in losses most often, and/or sees her attack suboptimally, is the Chigorin Defence. I'd be curious if you could improve results there without affecting overall positional understanding.

-Jesse Talbutt (PlasticIcon)

jhorthos commented 5 years ago

Ah, good advice - I had not thought of testing Chigorin. Thanks!

jhorthos commented 5 years ago

Any other useful tips on unusual openings she didn't seem to do well with? I am actually testing a fairly broad 4-ply book now - if it were made broad enough it would be very close to zero.

RollYrOwn commented 5 years ago

Let me think that over. You could always download a copy of ECO C and grep for "gambit". I'm kidding. Sort of. Depending on how ambitious you want to go, I have a feeling she doesn't quite get the open sicilian; I'm pretty sure relevant Sicilian games between SF and Leela would be in SF's favor.

Other than that, from what I've seen, her generalization at this point is pretty good. Also she doesn't play competitively often enough to generate much data. It might be interesting to try to set up a series of very short games with Stockfish so that we're not basing our opinions on single games.

Anyway I'm including the four games I know of where Leela played the Chigorin - the two from the current bonus, and the two from the Cup finals. j

Leela-Chigorin.zip

jhorthos commented 5 years ago

All solid stuff - thanks again. I agree she doesn't seem to quite get open Sicilian though she apparently generalizes well enough to manage.

jhorthos commented 5 years ago

Big kudos to fersbery for making a requested modification to opening book training and fixing a bug all in record time!

jhorthos commented 5 years ago

I now have preliminary results from a general 4-ply book. I generated about 300k games using Chad's 4-ply book and CLR-aart-222K (aka Little Demon, the best 128x10b net I know of). When I further trained Little Demon with these games it gained approximately 25 Elo (1,992 800-node games, CFS 100%) with an 8-ply book for the match (different and longer than the training book).

Videodr0me commented 5 years ago

I think such a scheme is actually zero in regard to most modern chess engine competitions. These are NOT really chess competitions according to FIDE rules. Most of them start from predetermined openings forced upon the engines. This is a different game (slightly, but surely, in a game-theoretic sense). As the choice of openings for these competitions is subject to human bias, it is only natural that properly training for this game (yes, even in a zero way) means reflecting these opening choices in Leela's training. One could even go as far as to say that most traditional engines have an "unfair" advantage because their evaluation functions and search are most likely heavily tested and tuned on "interesting" positions according to human selection. Also their testing regime already uses an opening book of predetermined positions.

So if we guess or come up with a reasonable way to predict the consideration set of these opening positions we could train better for this chess variant actually used in engine competitions. Some precautions should be taken though:

- We should not introduce excessive bias in our prediction of opening positions
- they should be chosen by statistical analysis of past competition opening choices, and historical opening prevalence weighted by recency (as recent choices are probably more refined)
- any well-defined process of choosing our training set will do, but it surely should not involve hand-picking lines
- we should preserve Leela's generalization abilities, by still training on the official starting position in a to-be-determined percentage of training games
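
One way to read these precautions is as a sampling rule for each training game's starting position. The sketch below is purely illustrative: the exponential half-life weighting, the 50% starting-position share, the opening lines, and their ages are all made-up assumptions, not anything proposed with numbers in this thread.

```python
import random


def recency_weights(ages_in_years, half_life=2.0):
    """Exponentially down-weight older opening choices (half_life is hypothetical)."""
    return [0.5 ** (age / half_life) for age in ages_in_years]


def sample_start(openings, ages_in_years, startpos_fraction=0.5, rng=random):
    """Pick the starting position for one training game.

    With probability startpos_fraction the official starting position is used;
    otherwise a book opening is drawn with recency-based weights.
    """
    if rng.random() < startpos_fraction:
        return "startpos"
    weights = recency_weights(ages_in_years)
    return rng.choices(openings, weights=weights, k=1)[0]


# Hypothetical data: opening lines and how long ago each appeared in competition.
openings = ["1. e4 e5 2. f4", "1. d4 f5", "1. d4 d5 2. c4 Nc6"]
ages = [0.0, 2.0, 4.0]
print(recency_weights(ages))
print(sample_start(openings, ages, rng=random.Random(0)))
```

Any well-defined rule of this kind would do; the point is only that the selection is mechanical rather than hand-picked.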

That being said, we are doing quite well with our current approach, and it still has a lot of potential for enhancements without resorting to schemes like these, but it can't hurt to investigate this early and lay the groundwork.

RollYrOwn commented 5 years ago

Hey jhorthos, I'd like to help, but I only have a windows box at the moment, and the link given in the discord goes to version 71, which is a broken build. I tried the earliest version I could find with an executable to download, but it was PAINfully slow - far more so than there's any reason for it to be.

Also, I hate to nitpick, but in the openings file there's no entry for 1.e4 c5 2.Nf3 d6, which is an insanely popular move order, since you can't reach the Najdorf with 2...Nc6.

jhorthos commented 5 years ago

Huh, I just used Chad top 100 - made no attempt to critique what is there. In any case any eventual use for training will almost certainly be with a broader book - this is a test.

jhorthos commented 5 years ago

Thanks for your comments Videodr0me, agree with all. If you know much about collecting and weighting competition choices that would be helpful. I have not made any attempt to do that.

jhorthos commented 5 years ago

RollYrOwn - that opening is in fact the second most common, and it is definitely in the opening file.

RollYrOwn commented 5 years ago

You're right - I don't know how I missed it. It's #2.

Naphthalin commented 4 years ago

As far as I can tell, this issue is resolved now in 2 ways:

- as discussed in #541, #1060 and others support training from opening book
- using --policy-softmax-temp=1.2 in training has made the need for an opening book to explore reasonable openings obsolete, even though it doesn't cover a huge number of KGA games.
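
To illustrate why a policy temperature above 1 widens opening exploration, here is a generic temperature-softmax sketch. It is not Lc0's code: only the flag name and the value 1.2 come from this thread, the logits are made up, and how Lc0 applies the temperature internally is not shown here.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by temperature; T > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up policy logits for four candidate moves.
logits = [2.0, 1.0, 0.0, -1.0]
sharp = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 1.2)
print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```

At temperature 1.2 the top move loses some probability mass and the least-favoured moves gain some, so rarely chosen openings get visited more often during self-play.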

mooskagh commented 4 years ago

I suggest the following plan:

1. Ask dkappe which openings FF plays better than Lc0 in his opinion (or look up what he says in forums).
2. Run a test FF vs Lc0 on these openings (if someone with testing ability has FF).
3. If Lc0 wins, officially declare Lc0 as queen of openings. Otherwise, optimize and keep this issue open.

mooskagh commented 3 years ago

This should be either closed or moved to discussions. Closing for now as the last activity was a long time ago, but if you feel it should be revived, it can be moved to discussions.

LeelaChessZero / lc0

Make Leela Queen of openings #872