niklasf / python-chess

A chess library for Python, with move generation and validation, PGN parsing and writing, Polyglot opening book reading, Gaviota tablebase probing, Syzygy tablebase probing, and UCI/XBoard engine communication
https://python-chess.readthedocs.io/en/latest/
GNU General Public License v3.0
2.4k stars 526 forks source link

Handle multiple comments on a move/variation/game #1068

Open MarkZH opened 7 months ago

MarkZH commented 7 months ago

Some chess programs that write PGN files (including lichess.org and others mentioned in the linked issue) will write multiple consecutive braced comments--e.g., 1. e4 { A very common opening } { Perhaps too common ... } { [%clk 0:00:01] }. Previously, when read by chess.pgn.read_game(), these comments would be joined into a single comment with a single space between them. This PR creates a new class to read, store, and write these comments while keeping them separate. The user interface is kept backwards compatible by using @property decorators to allow assignment and comparison of bare strings and string lists to game node comments.

Closes #946

niklasf commented 5 months ago

Hi. Good idea.

I think changing the property type in this way is not quite backwards compatible, even if __str__ is provided. Maybe something like

class GameNode:

    comments: List[Comment]

    @property
    def comment(self) -> str: ...

    @comment.setter
    def comment(self, value: str) -> None: ...

could work?

Or maybe it's also just time to consider 2.x at some point.

MarkZH commented 5 months ago

I tried to make GameNodeComment backwards compatible in terms of assignment and writing to a PGN file. The breaking change comes from getting comments from a node (with some allowances for comparing a single comment to a single string), since I didn't want users to accidentally miss comments. What would GameNode.comment return, just the first comment?

Saving this until 2.x would make things much easier. A new class wouldn't even be needed.

class GameNode:
    comments: list[str]

If you want to save this PR until 2.x, this would be an easy change.

niklasf commented 5 months ago

In 1.x, GameNode.comment would have to be the concatenation of all comments, to match the current behavior.

MarkZH commented 5 months ago

That should do it. GameNode.comments returns the GameNodeComment (a list[str] essentially), and GameNode.comment returns a single str formed from joining the strings with spaces.

MarkZH commented 4 months ago

Do you want to save this functionality for v2.0? I can make a new PR (not backwards compatible) that would be much simpler than this one--basically replacing all comment: str members of GameNode and ChildNode with comments: list[str] and changing the PGN comment functions.

niklasf commented 4 months ago

Let's do 2.x. So it would contain "small" breaking changes and clean-ups, and more ambitious plans and far-reaching changes would be pushed back to 3.x.

MarkZH commented 4 months ago

Updated changes:

  1. Rename GameNode comment field from comment to comments and starting_comment to starting_comments.
  2. Adapt methods that process comments (e.g., eval() and set_eval()).
  3. Adapt PGN reading and writing to handle moves/games with multiple comments.

The change from comment to comments is the breaking change. Assigning to node.comment no longer has any effect. I figured that using a plural name emphasizes that the field should be a list. Plus, assigning a string comment instead of a list of strings will cause problems with comment processing methods (Item 2, above).

MarkZH commented 2 months ago

Looks like the release of mypy 1.11 is catching new errors.

niklasf commented 2 months ago

Fixed via #1100. Targetting one final 1.x release before making the jump to 2.x.

MarkZH commented 2 months ago

Should extra whitespace be preserved when reading comments? To ask another way, what was the reason for this commit? I would think that a newline in the middle of a curly brace comment was put there by word wrapping and so would not be a significant part of the comment. Is there a problem with formatting imported comments with the equivalent of

comment = " ".join(comment.split())

to replace all whitespace with single spaces?

MarkZH commented 3 weeks ago

I'm thinking of adding error messages for when users try to use the now-deleted GameNode.comment or ChildNode.comment instead of the new GameNode.comments or ChildNode.comments. Something like this:

    _comment_error = AttributeError("GameNode.comment no longer exists. Use GameNode.comments to store a list of str comments.")

    @property
    def comment(self) -> None:
        raise GameNode._comment_error

    @comment.setter
    def comment(self, value: Any) -> None:
        raise GameNode._comment_error

    # And similarly for starting_comment

I'm thinking this could help catch errors in user code that would break when updating to v2.0.0. Then again, it would mask type checking errors if users tried to read or write to node.comment since that field would now exist. Thoughts?

niklasf commented 3 weeks ago

It's an ugly hack, but maybe we can declare the methods only if not typing.TYPE_CHECKING:?

MarkZH commented 3 weeks ago

I don't think one ugly hack can cancel out another ugly hack. I just tried type checking a simple example:

class move:
    def __init__(self) -> None:
        self.comments: list[str] = []

m = move()
m.comments = "asdf"
m.comment = "dkfjdl"

Mypy came up with the following:

test.py:6: error: Incompatible types in assignment (expression has type "str", variable has type "list[str]")  [assignment]
test.py:7: error: "move" has no attribute "comment"; maybe "comments"?  [attr-defined]
Found 2 errors in 1 file (checked 1 source file)

So, users who type check with mypy should notice the change from node.comment to node.comments. Users who don't should see that changes they make to node.comment have no effect on the output. I think this means that the breaking change should be prominently reported in the v2.0 changelog--both the name change and type change.