Zeta36 / chess-alpha-zero

Chess reinforcement learning by AlphaGo Zero methods.
MIT License

Second "good" results #37

Open Akababa opened 6 years ago

Akababa commented 6 years ago

By training the model with a similar config as in this repo (7 residual blocks of 256 depth) on FICS games, I think it's not doing too bad: (I played white, model gets 1200 sims/move)

image

The model and weights are in my fork, please feel free to clone and try for yourself (It's not compatible with this repo)

What are everyone's thoughts on this? Should I keep training this or start to scale up to more blocks? There are a total of 7*2+1=15 convolutions now, barely enough to traverse the board and back, so this might be a problem with respect to long-range tactics.
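
The "barely enough to traverse the board and back" estimate can be checked with a little arithmetic. This is a minimal sketch, assuming every convolution is 3x3 with stride 1 and no dilation, so each layer lets information travel one square in any direction. The helper names are hypothetical and not taken from the repo.

```python
BOARD_SIZE = 8  # chess board is 8x8


def conv_count(res_blocks, convs_per_block=2, stem_convs=1):
    """Total convolutions: one input convolution plus two per residual block."""
    return res_blocks * convs_per_block + stem_convs


def receptive_field(num_convs, kernel_size=3):
    """Side length of the input region one output square can see (stride 1)."""
    return 1 + num_convs * (kernel_size - 1)


def reach(num_convs, kernel_size=3):
    """How many squares information can travel in one direction."""
    return num_convs * (kernel_size // 2)


# Corner to corner is BOARD_SIZE - 1 = 7 steps, so there and back is 14.
round_trip = 2 * (BOARD_SIZE - 1)

for blocks in (7, 19):
    n = conv_count(blocks)
    print(blocks, "blocks:", n, "convs, reach", reach(n), "vs round trip", round_trip)
```

With 7 blocks the reach is 15 squares against a round trip of 14, which matches the "barely enough" remark; 19 blocks would give 39 convolutions under the same assumptions.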

Zeta36 commented 6 years ago

It looks very promising!! Please keep training for a while so we can check whether the model is able to finish a game without blundering!!

Akababa commented 6 years ago

Thanks for the encouragement! The tactics are a worry, but at least it's fun to play with (very aggressive lol).

What do you think of the idea of adding "knight-shaped" and "queen-shaped" convolutions? Is it too much "cheating"?

Also, I'm sure it can play without blundering if we give it enough sims, but the question is, is it validated yet? And how will we know when we can move forward?
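
One possible reading of the "knight-shaped" and "queen-shaped" convolution idea is a larger kernel whose weights are multiplied elementwise by a fixed 0/1 mask, so only squares a knight or queen could reach from the centre contribute. The sketch below only builds such masks in plain Python; it is an illustration under that assumption, not code from the fork, and the function names are hypothetical.

```python
KNIGHT_OFFSETS = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                  (1, -2), (2, -1), (-1, -2), (-2, -1)]


def knight_mask(size=5):
    """0/1 mask over a size x size kernel: centre plus the 8 knight jumps."""
    c = size // 2
    mask = [[0] * size for _ in range(size)]
    mask[c][c] = 1
    for dr, dc in KNIGHT_OFFSETS:
        mask[c + dr][c + dc] = 1
    return mask


def queen_mask(size=5):
    """0/1 mask: the centre's row, column, and both diagonals."""
    c = size // 2
    return [[1 if (r == c or col == c or abs(r - c) == abs(col - c)) else 0
             for col in range(size)]
            for r in range(size)]


for row in knight_mask():
    print(" ".join(str(v) for v in row))
```

A full-board queen-shaped kernel would need `size=15` to span all 7 squares in each direction, which is one reason stacking ordinary 3x3 layers is usually preferred.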

Zeta36 commented 6 years ago

@Akababa Do you mind if I copy your results into my repository?

Akababa commented 6 years ago

Sure, go for it! :)

I'll continue experimenting with 5x5 convolutions, tell me if you have any thoughts on this.

Zeta36 commented 6 years ago

@Akababa, could you please write up a short summary here of your improvements and the main differences from my current master branch, so I can mention them in the readme? Changes in the model input (the planes you use, etc.), changes in the workers, in the player_chess.py file, and so on.

Thank you!!

Akababa commented 6 years ago

Sure, I put it on my wiki to make it easier to update in the future. (LMK if it could be more detailed/improved!)

Akababa commented 6 years ago

I had some better results today, although I don't think it can get much better in its current form.

apronusdiagram1514071757

ako1983 commented 6 years ago

Who's white?

Akababa commented 6 years ago

@ako1983 Lol, I'm playing white. Is my chess ability really that bad? :)

ako1983 commented 6 years ago

No, you crushed it very nicely! What is the engine's rating?

Akababa commented 6 years ago

I guess around 1300 for now. I want to try out different architectures but training on my laptop is very slow!

Zeta36 commented 6 years ago

I'd like to know how many iterations you trained that model for. What loss did you reach? Did you train with a fixed set of SL files, or did you leave the SL worker active so that the play-data files kept changing?

About the limitations of the current model, maybe you could try with a higher "res_layer_num" in the configuration (I think DeepMind used 19).

Zeta36 commented 6 years ago

I created a new branch with your code (nohistory branch), @Akababa. Unfortunately I cannot merge it yet because I'm having some trouble with the 'sl' worker:

2017-12-24 09:23:32,822@chess.pgn ERROR # error during pgn parsing
Traceback (most recent call last):
  File "C:\Users\Samu\Anaconda3\lib\site-packages\chess\pgn.py", line 965, in read_game
    move = board_stack[-1].parse_san(token)
  File "C:\Users\Samu\Anaconda3\lib\site-packages\chess\__init__.py", line 2595, in parse_san
    raise ValueError("illegal san: {0} in {1}".format(repr(san), self.fen()))
ValueError: illegal san: 'h6' in 3rr1k1/p1pq1ppp/Npn3b1/8/3PnQ2/P4PN1/BP4PP/2R2RK1 w - - 4 23

Moreover, the 'self' worker is stalling for me, and after playing against the best model you uploaded to your branch in Arena, it is not as good as I expected (although I know it's much better than mine). Maybe you have better weights that you did not upload.

If you could review the branch, make the necessary changes, and maybe upload your latest weights, that would be very welcome. As soon as I see everything working I will merge into master.

Thank you for your work.

P.S. I removed the 'ujson' package since I get errors installing it on Windows and I don't think it's essential. I also cleaned up the source code a little.
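
The `illegal san: 'h6'` error in the traceback above can be examined straight from the FEN in the message. The stdlib-only sketch below (helper name hypothetical, not part of python-chess or the repo) decodes the position and shows that White is to move while `h6` is only reachable by Black's h7 pawn. That would be consistent with the move stream being one ply out of step with the board, for example because of a malformed PGN, but it does not prove the cause.

```python
def parse_fen_board(fen):
    """Return ({square: piece}, side_to_move) from a FEN string."""
    placement, side = fen.split()[:2]
    board = {}
    for rank_idx, row in enumerate(placement.split("/")):
        rank = 8 - rank_idx          # FEN lists rank 8 first
        file_idx = 0
        for ch in row:
            if ch.isdigit():
                file_idx += int(ch)  # a digit is a run of empty squares
            else:
                board["abcdefgh"[file_idx] + str(rank)] = ch
                file_idx += 1
    return board, side


# FEN copied from the error message in the thread.
FEN = "3rr1k1/p1pq1ppp/Npn3b1/8/3PnQ2/P4PN1/BP4PP/2R2RK1 w - - 4 23"

board, side = parse_fen_board(FEN)
h_file = {sq: p for sq, p in board.items() if sq[0] == "h"}

print("side to move:", side)   # 'w' means White
print("h-file pieces:", h_file)  # uppercase = White, lowercase = Black
```

White's only h-pawn sits on h2, so it can reach h3 or h4 but not h6; the token `h6` only makes sense as a Black move.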

Zeta36 commented 6 years ago

After reviewing the model you uploaded, I've found it plays really well. It's a big improvement over what we had reached until now. If you could take a look at the branch I created, "akababa-changes", and get the 'sl' and 'self' workers working properly, I'll merge right away :).

Akababa commented 6 years ago

Thanks for the code review @Zeta36!

Could you please post the pgn file which is giving you that error with SL? That looks like it may be a pgn format error or a chess.pgn bug.

What is the specific error with "self" worker?

Finally, a quick update on the model: I found that I wasn't able to train it further, so I experimented with 5x5 convolutions yesterday and the results weren't great. I will try to stack more residual blocks and use transfer learning next.

Akababa commented 6 years ago

About the supervised training: I trained with a total of 20-30,000 games from the FICS 2011-2016 database which ended in checkmate, although it's hard to say for sure because I was constantly changing the model and training weights, and fixed some bugs along the way.
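
Selecting only games that ended in checkmate could be approximated as below. This is a rough sketch, assuming standard SAN movetext in which a mating move ends with `#`; the helper is hypothetical and is not the filter used in the fork. A real pipeline would more likely replay each game with a PGN parser and check the final position.

```python
import re

RESULT_TOKENS = {"1-0", "0-1", "1/2-1/2", "*"}


def ends_in_checkmate(movetext):
    """True if the last move in the PGN movetext carries the SAN mate marker '#'."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop {...} comments
    moves = [
        tok for tok in text.split()
        if tok not in RESULT_TOKENS and not re.fullmatch(r"\d+\.+", tok)
    ]
    return bool(moves) and moves[-1].endswith("#")


print(ends_in_checkmate("1. f3 e5 2. g4 Qh4# 0-1"))
print(ends_in_checkmate("1. e4 e5 {Black resigns} 1-0"))
```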

ako1983 commented 6 years ago

Can't we train it with just a data set from 1850-1900? Or with just one specific player, let's say Anderssen, and let that model play against another one trained on, let's say, Zukertort, and see the results? In this way we could move along with the history of chess.

Zeta36 commented 6 years ago

@Akababa, I merged your branch into master. Could you please check out master and confirm that everything works? I cleaned up your code a little, and I'd like you to check that the 'uci', 'self' and 'sl' workers run normally on your local machine.

Thank you!!

Akababa commented 6 years ago

Awesome! I'll do that asap. I assume you got it working on your machine?

Zeta36 commented 6 years ago

Yes, but I'd still like you to review it, please. And if you have better weights, please upload them :).

Akababa commented 6 years ago

apronusdiagram1514178207

Lol, I just lost (as black). It's mostly my fault but still not a bad sign I guess.

Zeta36 commented 6 years ago

@Akababa, white plays very well. Are the weights used in that game the ones you uploaded to the repo, or is it a more advanced version?

Akababa commented 6 years ago

It's the same one except with virtual loss set to 2. I still wasn't able to improve on that with my 2 GB GPU :(
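
For readers unfamiliar with the "virtual loss" setting mentioned above: in parallel MCTS it temporarily makes an edge look worse while one search thread is still evaluating it, so other threads are nudged toward different branches. The sketch below follows the general scheme described for AlphaGo Zero; it is a minimal illustration, not the repo's implementation, and `EdgeStats`, `apply_virtual_loss`, and `backup` are hypothetical names.

```python
from dataclasses import dataclass

VIRTUAL_LOSS = 2  # the value mentioned in the thread


@dataclass
class EdgeStats:
    n: int = 0      # visit count
    w: float = 0.0  # total action value

    @property
    def q(self):
        """Mean action value; 0 for an unvisited edge."""
        return self.w / self.n if self.n else 0.0


def apply_virtual_loss(edge, vl=VIRTUAL_LOSS):
    """On selection: pretend the edge was visited vl times and lost each time."""
    edge.n += vl
    edge.w -= vl


def backup(edge, value, vl=VIRTUAL_LOSS):
    """After evaluation: undo the virtual loss and record the real visit."""
    edge.n += 1 - vl
    edge.w += value + vl


edge = EdgeStats(n=10, w=5.0)
apply_virtual_loss(edge)
print("q while in flight:", edge.q)
backup(edge, 1.0)
print("after backup:", edge.n, edge.w)
```

A larger virtual loss spreads concurrent simulations more widely across moves, at the cost of distorting the statistics more while evaluations are pending.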