Repetition opcode not respected in EPD boards

rpdelaney commented 6 years ago

Per PGN reference: http://www.thechessdrum.net/PGN_Reference.txt

16.2.5.20: Opcode "rc": repetition count

The "rc" opcode is used to indicate the number of occurrences of the indicated
position.  It takes a single, positive integer operand.  Any position,
including the initial starting position, is considered to have an "rc" value of
at least one.  A value of three indicates a candidate for a draw claim by the
position repetition rule.

It appears that this opcode is accepted by python-chess, but the created board does not report as expected with board.can_claim_threefold_repetition():

>>> import chess
>>> board = chess.Board()
>>> board.set_epd('3r2k1/p2r2p1/1p1B2Pp/4PQ1P/2b1p3/P3P3/7K/8 w - - bm e6 ; ce 132; acs1; acn 122623; rc 5')
{'bm': [Move.from_uci('e5e6')], 'ce': 132, 'acs1': None, 'acn': 122623, 'rc': 5}
>>> board.can_claim_threefold_repetition()
False
>>>

Looking at the set_epd() code in python-chess, it seems the only supported opcodes are "hmvc" and "fmvn"[1], because the given EPD is translated into a FEN, which is then passed to set_fen() to actually create and return a Board. That means that set_epd() in principle can't support any part of the EPD specification that is not also supported by FEN, unless it is refactored to create its own Board without calling set_fen(). Additionally, there is no way to retrieve any of the given opcodes from the Board once it is created, as they are not stored anywhere.

can_claim_threefold_repetition() checks for threefold repetition by working through the move stack and counting transpositions.[2] But an EPD does not provide a move stack. There is no property of a chess.Board that corresponds to the EPD 'rc' opcode that I can find.

I'm not sure if this qualifies as a bug or a feature request. But it seems python-chess has an incomplete implementation of EPD.

rpdelaney commented 6 years ago

What if, in addition to returning a dictionary of opcodes in the EPD, we also stored the dictionary in some board property like Board.opcodes? That would expose the opcode to can_claim_threefold_repetition(), or any other function that needs to check for EPD opcodes to behave correctly. e.g. I'm thinking is_game_over() should check for the 'resign', 'draw_accept', and maybe 'draw_claim' opcodes. I assume there are more.

niklasf commented 6 years ago

Unfortunately the other modules uci, pgn, syzygy, gaviota can't transparently support this. Some methods supporting rc and others not supporting it is probably even more unexpected than no specific support.

I was not so happy about hmvc and fmvn, but I guess it's simple enough and universally supported (in FENs) to justify.

rpdelaney commented 6 years ago

Unfortunately the other modules uci, pgn, syzygy, gaviota can't transparently support this.

That sounds like a challenge. :) What kinds of problems do you anticipate in implementing it there?

niklasf commented 6 years ago

PGN: There is no way to communicate the rc. It will be lost when writing to PGN and reading again (in python-chess or other software).
UCI: There's no way to communicate repetitions, aside from concrete moves.
Tablebases: Always assume there haven't been any repetitions.

rpdelaney commented 6 years ago

PGN: There is no way to communicate the rc. It will be lost when writing to PGN and reading again (in python-chess or other software).

If I create a board with an EPD, add a bunch of moves to the stack, and then print the tree out as a PGN, I've lost the opcode data that was in the original position descriptor. That's true, and I can't think of a way around that either. But I wouldn't say that's unexpected, since that's a feature of the PGN specification, not python-chess. Analogously, one might say that a video muxer (e.g., say, ffmpeg) should not support audio streams at all because some movie formats don't support audio, so it would be an inconsistent implementation to support audio in those formats that do allow it. After all, someone might use ffmpeg to convert a movie with an audio stream into a lossy format that doesn't support audio, and if they converted it back again later, they would have lost the audio. But that's not unexpected at all: it's just a consequence of the limitations of the format.

Likewise, if someone using python-chess wants to avoid the "lossyness" of the PGN format, they need to get all their work done with the Game before they print to PGN. That's just how it is. I don't know why that should be confusing or surprising to anyone who understands the limitations of these specifications.

As a postscript to that, it seems like python-chess could still do a better job when creating an EPD from a node in a given game: for instance, it could calculate the "rc" field, or other such fields that require historical data (i.e. moves). I don't see any downside to that -- why wouldn't I want those fields to appear in the EPD from python-chess? If I don't expect them then they'll just be ignored. Nothing surprising about that.

UCI: There's no way to communicate repetitions, aside from concrete moves.

True, we can't do this without changing the UCI standard. I can't think of much use for this anyway, but there's no way around it. Anyway, it's already the case that python-chess handles this in a way that might be surprising, since python-chess doesn't support the opcodes in the first place; so the result is completely the same. The only difference is which answer you can give when someone posts an issue here complaining that the engine didn't detect the threefold repetition or whatever: either "UCI doesn't support it, so there's nothing we can do" or "python-chess doesn't support it because... we don't want to." :) The point is, the user is getting surprised no matter what we do here.

Tablebases: Always assume there haven't been any repetitions.

Also true, but I'm not sure what the issue is there since an EPD from python-chess wouldn't include the (missing) opcodes anyway, as it is now, because python-chess doesn't support this at all. It's basically the same situation as with the UCI protocol. If python-chess did support the opcodes then I guess the onus would be shifted to whoever designs the next tablebase format (not that they would support repetitions, since they surely won't).

rpdelaney commented 6 years ago

tl;dr: There's no reason that someone who understands the limitations of these specifications (PGN, tablebase formats) and protocols (UCI) should be surprised that EPD opcodes appear and are interpreted correctly in some situations and not in others. That's just a fact of life when transforming data from one format to another, so this should come as no surprise to anyone knowledgeable about the formats they're using.

niklasf commented 6 years ago

Fair points. My other concern would be that many elements of the spec are extremely rarely used or outdated. For example I've never seen a program that handles rc, which limits its usefulness a lot. Since you're asking for it, I suppose there's at least one program that deals with it, though :)

Maybe an elegant implementation could convince me, but for now I still think it isn't worth any kind of complexity.

Regarding output, if implemented the field should be easy to add, similar to fmvn and hmvc, but I don't like always adding it automatically.

rpdelaney commented 6 years ago

Since you're asking for it, I suppose there's at least one program that deals with it, though :)

Lol. To be fair, I agree that EPD is not even remotely as widely adopted as FEN, and even my situation is a corner case. I discovered this when writing tests for python-chess-annotator to make sure that a function that checks repetitions was doing so correctly: I assumed I could just define a position using an EPD and work with that, since it was a bit more parsimonious than creating a Board and building a move stack with repetitions in it. But since it didn't work, I just did it the (very slightly) harder way.

So yeah. While I think python-chess would be better with improved EPD support, I agree it's not at all "on fire" to get this fixed.

Maybe an elegant implementation could convince me, but for now I still think it isn't worth any kind of complexity.

Yeah, maybe not. Since I'm still trying to level up my python, maybe this would be a good project for me to try to fix it and get you something PR worthy. I agree the result would have to be maintainable. When I get some time I'll dig into the code and see if I can get any concrete ideas that might be workable.

Regarding output, if implemented the field should be easy to add, similar to fmvn and hmvc, but I don't like always adding it automatically.

Hm. Why not? Is it necessarily a performance issue to generate them? IIRC even the EPD spec says that unrecognized fields must be ignored by the interpreter. Outside of performance I can't think of how this could cause a problem, unless you think downstream applications might have bugs in them...?

ddugovic commented 6 years ago

Perhaps rc could be useful for puzzles where certain positions (not in the analysis tree, due to an incomplete move history) need to be avoided. But I wonder without adding moves to the history how/where one would represent "already-visited" FENs.

rpdelaney commented 6 years ago

To clarify, I'm suggesting that implementing more of the opcodes described in the PGN reference, not just 'rc', should be a long term, albeit low-priority, goal. Some of them don't make sense for us to implement since python-chess can't really do anything with them: I guess storing them in the Board might be enough. But others make a lot of sense to build in. Scanning the list, I can pick out a few:

"draw_accept": accept a draw offer [e.g. is_game_over() would return True]
"draw_claim": claim a draw (must include "sm" opcode) [same as above]
"fmvn": fullmove number (already implemented)
"hmvc": halfmove clock (already implemented)
"rc": repetition count [is_game_over(), is_threefold_repetition(), is_fivefold_repetition() etc could return True if there are sufficient repetitions based on this opcode and following moves]
"resign": game resignation [is_game_over() would return True, PGN would have the appropriate result field filled in]
"sm": supplied move [maybe add it to the stack automatically? Not sure, that could also cause issues possibly]
"v0": variation name (primary, also "v1" though "v9") [same as above, except with all the variations]

rpdelaney commented 4 years ago

RIP

niklasf / python-chess

Repetition opcode not respected in EPD boards #259