Unwinnability Detector (adjudicating timeouts is now possible)

miguel-ambrona commented 3 years ago

Deciding whether a game should be drawn after a timeout is a fundamental problem which chess servers could only solve partially.

I propose a solution: https://github.com/miguel-ambrona/D3-Chess

I have analyzed the full Lichess database of standard rated games with the above tool, identifying ALL the games that have been unfairly classified. The tool can handle about 3000 games per second (running on 1 core) and I estimate that Lichess only needs about 12 games per second.

Do you think the above numbers are good enough for considering the integration of this tool into Lichess?

Relevant links: https://github.com/ornicar/lila/issues/4640 https://chasolver.org

ornicar commented 3 years ago

I selected a random game from the "unfair" list: https://lichess.org/pHeKcmmM#126 and I actually agree with the current result, i.e. black wins.

niklasf commented 3 years ago

But there is indeed no sequence of moves that ends in a black win. (White is forced to capture the rook, and black cannot win from there.)

Nevsor commented 3 years ago

An interesting position from the "unfair" list: https://lichess.org/lob5aMiS#89

White has only a king, but got adjucated a win, because black left. If punishing a player leaving with a loss (even if timeout would have been a draw) is intended behaviour of the lichess server, then it might be helpful to filter games with one of the players leaving from the list.

I really like the idea of using this tool for better timeout adjucation (in the sense of following FIDE rules instead of the simpler "not enough material" rule most chess servers use). Maybe its even possible to run this tool on every draw offer? This would allow a player to effectively "claim" a draw by Article 9.6 of the FIDE Laws of Chess. Article 9.6 says that the game ends immediately in unwinnable positions (similar to mate), but searching for a helpmate on every move does not seem computationally feasible. Therefore claiming a draw by pressing the draw offer button seems like a good compromise.

miguel-ambrona commented 3 years ago

Hi @Nevsor . I agree with you, Lichess may want to punish a player for leaving the game, even if their opponent cannot win. But what if the player unintentionally lost connection?

All games in the unfair list are flagged with "Time forfeit" termination. I guess we could detect whether it was a real timeout by additionally checking the PGN final comment. What do others think about this?

With respect to immediately terminating games in case of dead draws, I guess this can actually be done with the current tool: About 35 games per second were finished in Lichess in the last month, say a game is on average 150 plies long (I think this is a safe upper-bound). In that case you would need to analyze at most 35 * 150 = 5250 positions per second. That is doable with one single core if the "quick" mode is used. (Maybe searching for dead draws only after the amount of material crosses a threshold or every n moves could make this analysis much lighter.)

benediktwerner commented 3 years ago

There's no need to do any messing around with PGNs. Lila is well aware whether a game was lost by flagging or leaving.

miguel-ambrona commented 3 years ago

There's no need to do any messing around with PGNs. Lila is well aware whether a game was lost by flagging or leaving.

I mean to filter out my list.

miguel-ambrona commented 3 years ago

After implementing a few ideas that I had, the performance of the quick routine has significantly improved: The quick version of the algorithm is as complete as before but it now takes 3 us on average per position and a maximum of 400 us. (The last version used to take about 140 us on average and a maximum of 6 ms.)

This means the tool can now can handle more than 300K positions per second in quick mode.

ornicar commented 3 years ago

I guess we could deploy a new service for it, with a public API, that lichess would also use.

dennisjaheruddin commented 3 years ago

Some thoughts on where to use the logic:

If there is a draw offer I would definitely want to check this. However, this is likely insufficient as I believe you are not allowed to offer a draw if you tried to offer it earlier in the game. (Of course it should be checked asynchronously so people cannot induce lag by clicking the draw button).
On game end in case of timeout: Again definitely should happen. This should be cheap enough that it won't impact the hardware requirements. (I guess its ok if people need to wait a ms longer sometimes to see the end of game result, I would be hesitant to first show the win and then later update it to a draw.)

I also feel that doing a full analysis every single move will likely be a bit heavy (given the comparative infrequency of the problem), so perhaps start by doing a full analysis when a draw is offered or a game ends. Essentially this will make sure no game ends with the wrong result. (At least as far as we can detect it).

The only problem I could foresee with NOT checking every move is that players may actually depend on something to be adjudicated a draw, let the clock run out and then hitting a corner case which is not picked up by the logic (though this may literally be 1 in a million games?). Though this cannot be avoided technically unless we reach 100% perfection, I think this can be simply addressed with some wording which includes a small disclaimer, e.g. "If Lichess recognizes the position is unwinnable, the game is drawn" opposed to something like "If the position is unwinnable, the game is drawn."

benediktwerner commented 3 years ago

You can offer a draw every 10 moves iirc

But I think for now the question of when to check can be left until later. We can just start with time-out games.

Although completeness is indeed something that needs to be considered. I'd say if we start doing this, we should really aim for total completeness if possible. It doesn't seem worth it to go through all this effort to adjudicate games correctly and then still miss some in the end. Although, if I understand this correctly, there have only been two games ever on Lichess that the quick version didn't judge correctly which if true I guess still seems acceptable, especially given the huge increase in performance. But if we can support the full version, I'd rather go for that.

Nevsor commented 3 years ago

How about doing a full analysis on timeout and a quick analysis on every move?

That would ensure, that nearly every game is automatically ended as soon as the FIDE rules say they should end. In the rare circumstances that the quick mode misses an unwinnable position the players would have to play on until there is either a timeout, a draw by other means or a position that the analyzer can prove to be unwinnable. The important point from my point of view is that no game which reached an unwinnable position will ever be adjucated as won or lost.

Using the numbers provided by miguel-ambrona this should be easily possible from a performance point of view:

Quick mode: 5250 positions per second * 3 μs per position = 15.75 ms of single-core CPU time per second.

Full mode (assuming that every game ends in a timeout): 35 positions per second * 340 μs per position = 11.90 ms of single-core CPU time per second.

Both of those estimates are quite conservative, but they do only look at the average case and there are probably times where many more games are played than on average.

miguel-ambrona commented 3 years ago

@Nevsor, using the quick mode during the game and the full mode only after timeouts is a nice idea! I think it is the way to go in the long term.

In the short term, as @benediktwerner suggest, having a check only after timeouts could be good, possibly only with the quick algorithm to see how it performs.

By the way, let me clarify that the reported times do not consider the time needed for parsing the position, i.e., converting the FEN string into the data structure for a position. A small test suggests that parsing takes also around 3 us, but we should perform more complete benchmarks.

anematode commented 2 years ago

@miguel-ambrona, presumably the code could be integrated in a way such that parsing the position is never needed? Though C++ and Scala interop seems a bit painful, so I'm wondering about how the code would perform if translated to Scala. My understanding is that it uses a lot of bitboards and thus popcount instructions and so would probably be significantly slower?

miguel-ambrona commented 2 years ago

@anematode, is what you suggest an alternative to the API idea? (By integrating the decision procedure inside the Lichess source?)

The latest "quick algorithm" is fairly simple and could be surely adapted to Scala. But, as you say, we may lose some performance, probably more than 3 us per position.

It could be better to write Scala bindings to the C++ library, but we would still need to convert the position between the two representations. How does Lichess represent positions?

Nevsor commented 2 years ago

My Scala knowledge is not the best, but it seems that Lichess uses a Map[Pos, Piece] to represent the piece positions. Should not be too hard to convert those to bitboards: https://github.com/ornicar/scalachess/blob/master/src/main/scala/variant/Standard.scala

Egroegw commented 2 years ago

What's the current status of this? A quick check every move and a full check on a timeout sounds reasonable.

niklasf commented 2 years ago

What communication mechanism should we use? stdin/stdout similar to bbpPairings should do. Could also create an HTTP API wrapper. Either of those are probably better than FFI or a Scala port.

miguel-ambrona commented 2 years ago

I could help with the implementation of either option. What would be necessary? Can you point to how bbpPairings (or other external services) are integrated into Lichess?

ornicar commented 2 years ago

https://github.com/lichess-org/lila/blob/8c21d7f869cf4cf6933f7cc640154fe09fa4a79b/modules/swiss/src/main/PairingSystem.scala#L27-L39

niklasf commented 2 years ago

Ahh ... mhh ... that doesn't quite look like I expected, since it's using temporary files and reading all output at once.

I assume this detector would do better as a long running process. The old modules/ai (deleted after we switched to fishnet) was an implementation of the UCI protocol. Maybe that's something to go by.

It set up the process like that: https://github.com/lichess-org/lila/blob/8de7364cb470593cc3b85f59408e9f41cc8892ed/modules/ai/src/main/Process.scala

And hooked it up to an actor: https://github.com/lichess-org/lila/blob/8de7364cb470593cc3b85f59408e9f41cc8892ed/modules/ai/src/main/ActorFSM.scala#L16-L19

rpdelaney commented 2 years ago

Would it be useful to have a modified tablebase that includes only those positions where random moves still draw against perfect play? For instance, if black runs out of time but the tablebase lookup shows that random moves by black draw against God, then the game is a draw.

I doubt that would catch every case we'd want to catch, but maybe it could reduce runtime resource consumption by doing some of the computation up front.

miguel-ambrona commented 2 years ago

What positions would you store in such tablebase?

rpdelaney commented 2 years ago

I suppose that I'd drop positions where white has any move that loses AND black has any move that loses. I've never gone plumbing into how tablebases are usually structured, but it would probably be simpler than existing EGTBs since all you need is a fast way to look up a position and find out which side(s) have a necessary draw.

miguel-ambrona commented 2 years ago

But we cannot store every single position (of interest) in a database. There are too many!

We could probably do it for all positions up to 7 pieces, but this won't be very helpful since most positions have more than 7 pieces. Also, the lookup time in such a table could be even take longer than the current quick algorithm.

rpdelaney commented 2 years ago

Yup, lookup would have to be faster than the general solution or there's no point. I suspect quite a bit more than 7 pieces would be do-able since a huge number of branches would get pruned, though.

rpdelaney commented 2 years ago

Also, tablebase lookup need not necessarily be faster than the quick solver. The critical advantage of a tablebase is that if a position is found in it, then the tb is absolutely authoritative about whether the position is a certain draw. So it seems more like an alternative approach to the slow solver.

I've been thinking about this more and it seems possible that it might actually be a complete solution for a slow solver. At some point, as more pieces are added to the board, the number of necessary draws would decrease rather than increase. So, unlike a conventional egtb, this one would have its complexity recede to a limit.

In other words, it's possible that with contemporary hardware, a tablebase could be generated and stored that has every necessarily drawn position. I don't know how to prove that besides doing it though.

miguel-ambrona commented 2 years ago

I think your idea is interesting, but I invite you to perform some calculations and realize how big such tablebase should be.

The critical advantage of a tablebase is that if a position is found in it, then the tb is absolutely authoritative about whether the position is a certain draw.

But how do you build the tablebase? With a solver, right? So it would be as good as the solver. The solvers that we propose (both quick and full) are sound, so they provide the same guarantees you mention.

At some point, as more pieces are added to the board, the number of necessary draws would decrease rather than increase.

I am not sure about this, the more pieces, the more combinations on parts of the board unrelated to the unwinnability motif.

In other words, it's possible that with contemporary hardware, a tablebase could be generated and stored that has every necessarily drawn position.

Assuming that the number of chess positions where White can't win is small enough so that they can be stored (if only this would be true), how could we generate them? We can't just enumerate all chess positions and filter them out, there are too many! Directly enumerating all unwinnable positions seems out of the scope too.

rpdelaney commented 2 years ago

I was thinking it would be generated similar to how EGTBs are generated, as I understand it: i.e., start at the game-end position and search backward, filtering as we go. A position that isn't a necessary draw is pruned from the search tree. Engines or other heuristics aren't used to generate EGTBs either, so I don't expect they would be needed there, or even helpful. In my head, it seems to me that having more pieces on the board means more ways to eventually lose, which means more branches getting pruned as we work backward.

But -- I think it's clear that we couldn't know for sure how big this draw-tablebase would get without actually generating it, and doing that implementation is a new, unorthodox thing that would take some significant work to even figure out how to do it (much like the approach you've taken). This is an open-source project, and nobody should be doing work that they don't find both interesting and promising, so I'll leave it there for now. You actually have a prototype, so that's what this issue should be about - I already feel too much time has been spent discussing my diversion.

kraktus commented 2 years ago

I’ve been working on helpmate tablebases for a few months now and make good progress despite limited time: https://github.com/kraktus/helpmate-tb

Knowing which positions are drawn in absolute is not enough because imo the FIDE rule takes the 50 (or at least 75)move rule into account, even if that is not very clear.

miguel-ambrona commented 2 years ago

@rpdelaney

I think it's clear that we couldn't know for sure how big this draw-tablebase would get without actually generating it.

We could try to estimate a lower-bound before actually going through the effort of implementing it.

I already feel too much time has been spent discussing my diversion.

It is worth exploring different options and I appreciate your comments, don't consider them a diversion. Your idea is great and brings a new perspective for addressing this problem, which is so hard and will only be completely solved by collaboration.

rpdelaney commented 2 years ago

Ok, fair enough.

We could try to estimate a lower-bound before actually going through the effort of implementing it.

I lack the technical knowledge of how to do this myself, but I'd love to see what others can come up with there. kraktus may have input also, if they have already been able to generate a prototype tb.

@kraktus:

Knowing which positions are drawn in absolute is not enough because imo the FIDE rule takes the 50 (or at least 75)move rule into account, even if that is not very clear.

syzygy EGTBs have a dtz attribute. Couldn't something similar be included here, so that consumers (such as lila) can determine if the 50/75 move rule makes a position a certain draw for the "losing" side of any game?

miguel-ambrona commented 2 years ago

If we really want to capture every position in the tablebase, we are gonna need to capture positions like this one, with extra (dark-squared white and/or light-squared black) bishops, up to 5 bishops per side (for legality). I count approximately 32^2 * (28 choose 5)^2 ways of positioning such 10 bishops which is roughly 10^13. Assuming we use Zobrist keys to index positions (which we may not, since avoiding collisions is crucial) the tablebase for only that pawn structure would require about 70 Terabytes. Now, multiply that by all possible blocked pawn-structures, it gets enormous. (And this is just for "blocked" positions, there are other types of unwinnable ones.)

Such count makes me think we should use the tablebase in combination with something else. Maybe some kind of baby-step-giant-step-like approach.

syzygy EGTBs have a dtz attribute. Couldn't something similar be included here

Yeah! We could ideally store the halfmove clock threshold that makes the difference between being winnable or unwinnable, for every position. For the 5-fold repetition rule things get more complicated though.

Anyway, the FIDE rules are ambiguous with respect to whether the 75 moves rule and the 5-fold repetition are involved in the rule for adjudicating timeouts. I lean towards the opinion that they should not be considered.

rpdelaney commented 2 years ago

Here's a good one from reddit: post

miguel-ambrona commented 2 years ago

Here's a good one from reddit: post

Nice! That game is from last month so it is already listed in the unfair games up to May 2022 (very close to the end of the file).

I have recently been working on improving the tool's logic. The goal was to capture new very exotic positions, but this also led to performance improvements for the application we are interested in: actual Lichess games.

At some point I would like to spend some time on integrating the tool based on what @niklasf suggested above. I could use some help with Scala though.

Daar543 commented 1 year ago

Just stumbled upon this - can the timeouts be actually adjudicated manually, by the losing player to actually asking a support for the game to be refunded (similar to a cheating opponent) ?

Kurt-von-Laven commented 5 months ago

That doesn't seem like a good use of precious moderator time. Personally I care a lot less how a timeout is adjudicated, provided that it's consistent, than that it is automated. Allowing human review opens the door to human error, the opponent being shocked to find the game result changed after the fact, inconsistent outcomes from identical final positions, and confusion regarding the timeout rules. The last one especially bothers me since players may correctly opt for quick but dubious moves in a time scramble based on the assumption that they can win by timeout as long as they keep sufficient mating material alive.

lichess-org / lila

Unwinnability Detector (adjudicating timeouts is now possible) #9249