GameTechExplained / Chess-Challenge


Strength of the submission? #1

Open cj5716 opened 11 months ago

cj5716 commented 11 months ago

Hi, I didn't intend to open an issue but rather a discussion; however, it seems that option is not available on this repo. I would just recommend measuring the strength of your submission by playing it against the expected winner of this tournament: https://github.com/analog-hors/Boychesser (expected to win by 100 Elo over the currently known second and third place, https://github.com/GediminasMasaitis/Chess-Challenge-Submission/tree/submission and https://github.com/Tyrant7/Chess-Challenge/)

They are also good resources to learn from!

Btw, I would recommend doing your testing with cutechess, as it allows concurrency (i.e. playing multiple games at once and utilising all available threads) and lets you run SPRTs (sequential probability ratio tests), which keep playing games until the sample size is meaningful rather than stopping at a fixed 1000 games.
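To make the SPRT idea concrete, here is a minimal Python sketch of the stopping rule. It is an illustration under stated assumptions, not cutechess's actual implementation: it uses a simplified normal approximation of the log-likelihood ratio with the logistic Elo model, and the function names and the default values of `elo0`, `elo1`, `alpha` and `beta` are hypothetical choices for this example.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) for a player who is elo_diff Elo stronger (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def sprt_llr(wins, draws, losses, elo0, elo1):
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0).

    Assumption: normal approximation of the per-game score distribution.
    """
    n = wins + draws + losses
    if n == 0:
        return 0.0
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score, where a game scores 1, 0.5 or 0.
    variance = (wins + 0.25 * draws) / n - score ** 2
    if variance <= 0.0:
        return 0.0  # Not enough variety in results to estimate anything yet.
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * variance / n)


def sprt_decision(wins, draws, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'H1', 'H0' or 'continue' for the results seen so far."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "H1"  # Evidence favours a gain of at least elo1.
    if llr <= lower:
        return "H0"  # Evidence favours a gain of at most elo0.
    return "continue"  # Keep playing games.


if __name__ == "__main__":
    # Hypothetical running totals, checked after each batch of games.
    print(sprt_decision(wins=400, draws=400, losses=200))
```

The point of the design is that the test is re-evaluated as results come in and stops as soon as either bound is crossed, so the number of games adapts to how clear the difference is instead of being fixed in advance.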

GameTechExplained commented 11 months ago

Awesome, I'll check those out! Definitely much appreciated. Also, discussions are enabled now (I actually did not realize you had to manually enable them)

mattbruv commented 11 months ago

Hi @GameTechExplained, I found your video super entertaining and I can't wait to see how it does in the tournament. I wanted to play against your bot to test its skill. I feel like after playing it for a few hours I have a decent understanding of its skill level and strengths/weaknesses.

I know my review isn't as objective as having your engine compete against other engines as suggested above, but I think estimating its strength through the human lens is interesting and worth discussing too. For context, I'm rated 2000-2100 on both chess.com and lichess in bullet/blitz time controls, so I'm around expert level but not master level.

It's actually quite an engaging engine to play. Normally it's never fun playing engines because they're either way too strong or way too weak, but oddly enough this engine is actually really entertaining to play against, probably because I'm punching right above my weight while playing it.

Overall I played roughly 20 games against it. When I played my moves instantly based on intuition, I lost probably 80-90% of the time because it would outplay me tactically. However, when I slowed down a bit and put more thought into the game (~10 minute games), I found that I was almost always able to get an advantageous position and eventually win by using its flaws against it. For example, here is a video of me scoring 2 wins against it as white and black, each game about 7 minutes, and each game highlights some of its weaknesses. Game 1, Game 2

If I had to compare its skill to that of a human, I would divide it into two categories: tactics and positional play. The engine easily has the tactical awareness of players who are 2300-2400 level, i.e. master level players. It's very solid tactically.

Positionally speaking, it's consistently and noticeably much weaker. I would put its positional understanding at that of a 1600-1800 rated player. Most of the engine's flaws are in its positional play. I would say that is its Achilles' heel. It will be interesting to see how many of its losses in the tournament are due to not understanding positional play in chess vs. getting out-calculated.

Here are some noticeable problems it has:

If I had to summarize its play, I would say that it is very good tactically, but its skill is hampered significantly by its lack of positional understanding. It's also very timid and has no concept of preparing its moves or planning a long-term strategy. If there are no tactics, it will play super passively.

For any humans who want to play this engine, I think this strategy is the best approach:

  1. Aim to trade off pieces, especially queens, to minimize the chance that you blunder something tactically.
  2. Play solid waiting moves, and eventually it will make a glaringly bad pawn move or strategic misstep.
  3. Slowly improve your position; over time it will continue to make obvious missteps, and those will begin to compound.
  4. By this point, you should have a very strong position. You should now start seeing tactics appear for you, or the path to winning should be obvious.

This strategy for beating the engine is pretty much demonstrated in those two games, where an advantage is slowly accrued to the point where the engine crumbles:

image image

I really enjoyed watching your video and playing against the engine, and I look forward to seeing how it does overall and hearing other people's opinions on its playstyle.

cj5716 commented 11 months ago

Playstyle in this case is determined by the search and the evaluation function. Btw, your search has some bugs, which are discussed here: https://discord.com/channels/719576389245993010/1131978746346360873