thisisparker / xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and save them as .puz files ⬛⬜⬛
MIT License

Issue with USA Today puzzles. #122

Closed: ChristianVaughn closed this issue 9 months ago

ChristianVaughn commented 10 months ago

As of 8/29/23 I have been running into issues trying to download the daily puzzle. I tried both xword-dl usa --latest and xword-dl usa -d 8/30/23; the error message is below. I haven't had time to research what is causing this. The last date I can download without error is 8/28/23.

Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts\xword-dl.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\xword_dl\xword_dl.py", line 233, in main
    puzzle, filename = by_keyword(args.source, **options)
  File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\xword_dl\xword_dl.py", line 45, in by_keyword
    puzzle = dl.download(puzzle_url)
  File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\xword_dl\downloader\basedownloader.py", line 96, in download
    puzzle = self.parse_xword(xword_data)
  File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\xword_dl\downloader\amuniversaldownloader.py", line 64, in parse_xword
    puzzle.width = int(xword_data.get('Width'))
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

thisisparker commented 10 months ago

Yeah, this is weird, right? It looks like the JSON files for today's and yesterday's puzzles are missing some key metadata fields. It's easy enough to work around the missing height and width (and I actually wrote a little patch to do that), but the files are also missing the title and author fields. Strangely enough, it seems to affect only those two days.
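
For illustration only, here is a minimal sketch of how missing dimensions could be worked around. It is not the patch mentioned above. It assumes the puzzle data is a dict that may lack the 'Width' and 'Height' keys, and that the full solution is available as one flat string under a key passed in as solution_key (the default name "AllAnswer" is an assumption, as is the helper name infer_dimensions). When either dimension is missing, it guesses a square grid from the solution length.

```python
import math


def infer_dimensions(xword_data, solution_key="AllAnswer"):
    """Return (width, height) for a puzzle dict.

    Hypothetical helper: uses 'Width' and 'Height' when both are present,
    otherwise assumes a square grid and derives the side length from the
    flat solution string stored under solution_key (an assumed key name).
    """
    width = xword_data.get("Width")
    height = xword_data.get("Height")
    if width is not None and height is not None:
        return int(width), int(height)

    # Fall back to a square-grid guess based on the number of cells.
    solution = xword_data.get(solution_key) or ""
    side = math.isqrt(len(solution))
    if side == 0 or side * side != len(solution):
        raise ValueError(
            "cannot infer grid size from a solution of length %d" % len(solution)
        )
    return side, side
```

A square-grid guess would be wrong for any rectangular puzzle, so the sketch raises an error rather than guessing when the cell count is not a perfect square.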

I'm inclined to chalk this up to some hiccup in their production process, and while I think this should fail a little more gracefully, I think it's OK for xword-dl to not be able to grab a handful of malformed puzzles. But I'll keep this issue open for a week or two and see if it comes up again. Worst case, there are alternative routes to the puzzle that I can explore.
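
As a rough sketch of what failing "a little more gracefully" could look like: the TypeError in the traceback comes from passing None (the result of .get('Width') on a dict without that key) straight to int(). Checking for required fields up front lets the tool report which fields are absent instead. The names below (MalformedPuzzleError, check_required_fields, REQUIRED_FIELDS) are hypothetical and not part of xword-dl, and the 'Height', 'Title', and 'Author' key names are assumptions; only 'Width' appears in the traceback.

```python
class MalformedPuzzleError(Exception):
    """Raised when downloaded puzzle data lacks required metadata."""


# Assumed key names; only 'Width' is confirmed by the traceback.
REQUIRED_FIELDS = ("Width", "Height", "Title", "Author")


def check_required_fields(xword_data, required=REQUIRED_FIELDS):
    """Raise a descriptive error listing any missing or empty fields."""
    missing = [key for key in required if xword_data.get(key) in (None, "")]
    if missing:
        raise MalformedPuzzleError(
            "puzzle data is missing fields: " + ", ".join(missing)
        )
```

Calling check_required_fields(xword_data) before parsing would turn the opaque int() failure into a message naming the missing fields.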