kevinludwig / pgn-parser

Parse PGN files using peg.js
MIT License

Accented characters are ignored in comments #12

Closed: ArneVogel closed this issue 4 years ago

ArneVogel commented 4 years ago

The parser drops accented characters inside comments.

Example PGN:

[Event ""]
[Site ""]
[Date "????.??.??"]
[Round ""]
[White ""]
[Black ""]
[Result "*"]

1.e4 e5 2.f4 {A à E é I î O ô U ù Y} *

The comment is parsed as A E I O U Y.

kevinludwig commented 4 years ago

Thanks, will take a look

kevinludwig commented 4 years ago

Hi @ArneVogel, I added a test to try to reproduce your issue but was unable to; see https://github.com/kevinludwig/pgn-parser/blob/master/test/test_grammar.js#L159. I also ran your example snippet directly, and the accented characters were parsed correctly:

$ ./src/pgn-parser.js ./example.pgn
[
  {
    "comments_above_header": null,
    "headers": [
      {
        "name": "Event",
        "value": ""
      },
      {
        "name": "Site",
        "value": ""
      },
      {
        "name": "Date",
        "value": "????.??.??"
      },
      {
        "name": "Round",
        "value": ""
      },
      {
        "name": "White",
        "value": ""
      },
      {
        "name": "Black",
        "value": ""
      },
      {
        "name": "Result",
        "value": "*"
      }
    ],
    "comments": null,
    "moves": [
      {
        "move_number": 1,
        "move": "e4",
        "comments": []
      },
      {
        "move": "e5",
        "comments": []
      },
      {
        "move_number": 2,
        "move": "f4",
        "comments": [
          {
            "text": "A à E é I î O ô U ù Y"
          }
        ]
      }
    ],
    "result": "*"
  }
]

Let me know if there is something about your example I am not understanding.

ArneVogel commented 4 years ago

Ah yes, sorry. I overlooked a sanitization step in my own code that removed the accented characters before the text reached the parser. It is working now.
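
For anyone who lands here with the same symptom, here is a minimal sketch of how a pre-parse sanitization step can produce exactly the output reported above. The function names `stripNonAscii` and `normalizeOnly` are hypothetical; they are not taken from ArneVogel's code or from pgn-parser, and the actual sanitization step in that code may have looked different.

```javascript
// Hypothetical sanitizer (an assumption for illustration, not code from this thread):
// it removes every character outside 7-bit ASCII, which silently drops accented letters.
function stripNonAscii(text) {
  return text
    .replace(/[^\x00-\x7F]/g, '') // drop all non-ASCII characters
    .replace(/ {2,}/g, ' ');      // collapse the runs of spaces left behind
}

// Gentler alternative (also hypothetical): normalize to NFC so composed and
// decomposed accents compare equal, without discarding any characters.
function normalizeOnly(text) {
  return text.normalize('NFC');
}

const comment = 'A à E é I î O ô U ù Y';

console.log(stripNonAscii(comment)); // A E I O U Y
console.log(normalizeOnly(comment)); // accented characters are preserved
```

If parsed comments come back without their accents, checking the input string right before it is handed to the parser is a quick way to tell whether the loss happens in the parser or upstream of it.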