lichess-org / api

Lichess API documentation and examples
https://lichess.org/api
GNU Affero General Public License v3.0
429 stars 142 forks source link

export all studies of a user is missing newlines between chapters, studies, and at the end. #125

Closed jomega-github closed 3 years ago

jomega-github commented 3 years ago

Thanks for this new feature to the API. It works, but the simple try I made, writing the returned result to a file, produces a PGN file that is missing the usual newlines between chapters and studies and no newline at the end. This may cause problems for tools that read PGN files. Of course, I can parse the returned object and add the newlines, but it would be less work for anyone using this feature of the API if the newlines were included.

ornicar commented 3 years ago
[Event "Campionatul National Individual de Sah pentru Seniori 2021"]
[Site "Iasi - Hotel Moldova, Romania"]
[Date "2021.04.19"]
[Round "1.10"]
[White "Prisacaru, Stefan-Emilian"]
[Black "Nanu, Costica-Ciprian"]
[Result "0-1"]
[WhiteElo "1913"]
[BlackElo "2503"]
[TimeControl "90 min + 30 sec"]
[Annotator "https://lichess.org/@/broadcaster"]
[UTCDate "2021.06.30"]
[UTCTime "09:51:57"]
[Variant "Standard"]
[ECO "A00"]
[Opening "Hungarian Opening: Slav Formation"]

1. g3 { [%clk 1:27:05] } 1... d5 { [%clk 1:30:14] } ...  51... Rb1 { [%clk 1:05:11] } 0-1

[Event "Campionatul National Individual de Sah pentru Seniori 2021"]
[Site "Iasi - Hotel Moldova, Romania"]
[Date "2021.04.19"]
[Round "1.11"]
[White "Gavrilescu, David"]
[Black "Negrean, Andrei"]
[Result "1-0"]
[WhiteElo "2476"]
[BlackElo "1904"]
[TimeControl "90 min + 30 sec"]
[Annotator "https://lichess.org/@/broadcaster"]
[UTCDate "2021.06.30"]
[UTCTime "09:51:57"]
[Variant "Standard"]
[ECO "B90"]
[Opening "Sicilian Defense: Najdorf Variation"]

1. e4 { [%clk 1:30:54] } 1... c5 { [%clk 1:30:49] } ...  25. Qxg7# { [%clk 0:57:10] } 1-0

[Event "Campionatul National Individual de Sah pentru Seniori 2021"]
[Site "Iasi - Hotel Moldova, Romania"]
[...]

I just tried and the double newlines are present

jomega-github commented 3 years ago

ornicar: Thanks for the quick response. I'm guessing that the difference in result has something to do with the client you are using versus what I'm using, or my own misunderstanding of how to handle the returned Python map object. I'm using the Python client named berserk; which is the one listed on the API doc page. That client has not been updated in 15 months. I'm a relatively new Python programmer, but before retiring I was a programmer. Because berserk had not been updated for the new functionality of getting all the studies of a user, I added the method to do that. Doing that was easy because it is almost the same code as the existing method to get all chapters. Single stepping in my IDE debugger I do indeed see the double newlines at the point the data first comes back from the "get'. Something must be removing those later.

Because I'm also relatively new to github, I was not comfortable with making a fork of berserk. Maybe I should try that? In case you are interested, here is the method I added to bersker's clients.py. Added right after the existing "export" method in clients.py at the end of that file.

def export_all_pgn(self, username):
    """Export all chapters of a user.

    :return: the studies of the user
    :rtype: list
    """
    path = f'/study/by/{username}/export.pgn'
    return self._r.get(path, fmt=PGN, stream=True)
jomega-github commented 3 years ago

I found the problem. It is in the berserk Python client. The method parse_stream() in formats.py has two places where it does this

yield '\n'.join(lines).strip()

In the current version of Python, strip() with no arguments removes leading and trailing whitespace; and a newline is considered whitespace. I don't know why berserk is calling strip at all. Shouldn't berserk assume that the PGN is correct? The result of the code as written is that all newlines are stripped off the end of the returned string. Hence, the caller cannot tell how many newlines were there before this strip. And the end result is that the chapters and studies will not have newlines between them and there will be no newline at the very end.

I modified my copy of berserk to change the two places with this above code snippet to instead be

yield '\n'.join(lines) + '\n'

I reran my test, which test a user who has exactly 2 public studies, and it now correctly puts in the newlines.

Dboingue commented 3 years ago

That means you also compiled the client? or is it python itself and not compiled. maybe linking this issue in a new issue over at berserk repo. would eventually get some good Samaritan, see the code "en passant", toward another patch or pull request. Worth leaving it there. at is does not seem completely closed from bersek point of view. but maybe already done.

I think one needs to be a collaborator on both repository for referencing issues, though this UI. otherwise manually copying link in the triple dot menu. and create a new issue over there. would do it. should I ? as prospective future user of berserk, and even more new at python. (a bit less at github interactive, maybe).

jomega-github commented 3 years ago

That means you also compiled the client? or is it python itself and not compiled. maybe linking this issue in a new issue over at berserk repo. would eventually get some good Samaritan, see the code "en passant", toward another patch or pull request. Worth leaving it there. at is does not seem completely closed from bersek point of view. but maybe already done.

I think one needs to be a collaborator on both repository for referencing issues, though this UI. otherwise manually copying link in the triple dot menu. and create a new issue over there. would do it. should I ? as prospective future user of berserk, and even more new at python. (a bit less at github interactive, maybe).

I cloned the berserk repository to my local machine, made the changes, and then ran my test case. It is Python; which is an interpreted language that does not require "compilation" in the sense I think you are thinking.