thisisparker / xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and save them as .puz files ⬛⬜⬛
MIT License

Error parsing 2023-12-17 NY Times crossword #154

Closed: rjmorris closed this issue 5 months ago

rjmorris commented 6 months ago

Using the latest release of xword-dl (2023.12.2), I tried to download and parse today's NY Times crossword (2023-12-17), but I got the following error:

Traceback (most recent call last):
  File ".../xword/bin/xword-dl", line 8, in <module>
    sys.exit(main())
  File ".../xword/lib/python3.8/site-packages/xword_dl/xword_dl.py", line 233, in main
    puzzle, filename = by_keyword(args.source, **options)
  File ".../xword/lib/python3.8/site-packages/xword_dl/xword_dl.py", line 45, in by_keyword
    puzzle = dl.download(puzzle_url)
  File ".../xword/lib/python3.8/site-packages/xword_dl/downloader/basedownloader.py", line 96, in download
    puzzle = self.parse_xword(xword_data)
  File ".../xword/lib/python3.8/site-packages/xword_dl/downloader/newyorktimesdownloader.py", line 142, in parse_xword
    elif square and len(square['answer']) == 1:
KeyError: 'answer'

I didn't have any problems with the previous two days' puzzles.

I pulled the puzzle's JSON from the NY Times site and looked for the source of the error. I found 7 squares that would have caused the exception. Here's one example:

{
    "clues": [6, 68],
    "moreAnswers": {
        "valid": ["BRIDGE", "B"]
    },
    "type": 3
},

I'm guessing this is related to rebus answers. It looks like the source code doesn't anticipate a moreAnswers key here, instead expecting an answer key. For comparison, here's another square in the JSON that doesn't raise an exception:

{
    "answer": "M",
    "clues": [6, 67],
    "type": 1
},
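To make the difference between the two square shapes concrete, here is a minimal sketch of a lookup that tolerates a missing answer key. It is not xword-dl's actual code or the maintainer's fix: the function name square_solution and the PLACEHOLDER value are hypothetical, and it assumes that a falsy square is a black square and that the first entry of moreAnswers.valid is the preferred fill.

```python
from typing import Optional

# Hypothetical stand-in for a square that has no usable answer at all.
# This value is an assumption for illustration, not something xword-dl defines.
PLACEHOLDER = "X"


def square_solution(square: Optional[dict]) -> Optional[str]:
    """Return a solution string for one cell, or None for a black square."""
    if not square:
        # Assumption: empty or missing entries represent black squares.
        return None

    # Ordinary squares carry their solution under "answer".
    answer = square.get("answer")
    if answer:
        return answer

    # Squares like the failing example only list alternatives under
    # "moreAnswers" -> "valid"; take the first one (assumed to be preferred).
    valid = square.get("moreAnswers", {}).get("valid", [])
    if valid:
        return valid[0]

    # Neither key is present: fall back to the placeholder instead of raising.
    return PLACEHOLDER


ordinary = {"answer": "M", "clues": [6, 67], "type": 1}
failing = {"clues": [6, 68], "moreAnswers": {"valid": ["BRIDGE", "B"]}, "type": 3}

print(square_solution(ordinary))
print(square_solution(failing))
```

This only avoids the KeyError. It does not address how a multi-letter answer such as BRIDGE has to be written into the .puz file, which is likely where a naive patch still ends up producing an invalid file.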

I tried to revise the code to handle the moreAnswers case, but all my attempts led to an invalid .puz file (as reported by cursewords when I tried to open my .puz), and I didn't try to investigate further.

I'd rather not post the puzzle's JSON here since it's behind a paywall, but if there's anything else I can provide to help, let me know.

thisisparker commented 5 months ago

I think this should be fixed in the repo and I'll ship a version soon that includes it! The issue has to do with squares that have an officially blank solution, which .puz doesn't support, but I've got a workaround in place.

rjmorris commented 5 months ago

Great! Thanks for your work on this project!