thisisparker / xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and save them as .puz files ⬛⬜⬛
MIT License

Add downloader for Financial Times puzzles #180

Open afontenot opened 9 months ago

afontenot commented 9 months ago

This adds a downloader for the three Financial Times puzzles: the daily Cryptic, the weekly Polymath, and the weekly Weekend puzzle.

This adds a new dependency on pycryptodome for AES decryption. I approached this in such a way that the dependency could easily be swapped out with a different one if preferable.

afontenot commented 9 months ago

I just want to say it's understandable if you don't want to add this one, either because of the added dependency or the deobfuscation involved. It was no great amount of work on my part, I just set it as my Sunday project (last Sunday), and I've been testing it every day since then to make sure it worked reliably.

Hopefully FT does not syndicate their crosswords from some other, more easily scraped source. I attempted to figure out if this was the case, but didn't see any evidence of it.

mixographer commented 4 months ago

I tried to test this change, but I just get an error stating that the keyword is unrecognized.

Keyword ftc not recognized.

afontenot commented 4 months ago

> I tried to test this change, but I just get an error stating that the keyword is unrecognized.
>
> Keyword ftc not recognized.

Could you provide reproduction steps? The following works for me:

git clone https://github.com/thisisparker/xword-dl
cd xword-dl
git fetch origin pull/180/head:ft
git checkout ft
python -m xword_dl ftc

mixographer commented 3 months ago

@afontenot you are correct, this does work for me, I was doing it wrong.

mixographer commented 2 weeks ago

This was working for me, but today it seems to fail:

  File "/opt/homebrew/lib/python3.11/site-packages/xword_dl-2024.7.20-py3.11.egg/xword_dl/downloader/financialtimesdownloader.py", line 95, in <lambda>
    for _, clue in sorted(xword[direction].items(), key=lambda x: int(x[0])):
                                                                  ^^^^^^^^^
ValueError: invalid literal for int() with base 10: '21D'

afontenot commented 2 weeks ago

> This was working for me, but today it seems to fail:

Can confirm, I'll look at it.

afontenot commented 2 weeks ago

So the problem is a bogus clue in the puzzle data. It even shows up in the web interface:

(Screenshot: Screenshot_20241018_024629)

The issue is that there are two 21 Down clues, and one of them is at the end with a 21D label instead of the usual clue number.

~Maybe this is some hack they've put in the puzzle format to allow one clue to reference another?~ Edit: it's not, there are other clues with references like 11 Down. I could make the code skip invalid clues but I'm not sure if that is really reasonable here. I'm not inclined to make a change like that if this is a genuine mistake in their puzzle.
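For reference, the "skip invalid clues" option could look roughly like the sketch below. It is not the code in this PR. It assumes, based only on the traceback, that each direction is a dict mapping string labels such as "21" to clue data, and the helper name `split_valid_clues` is hypothetical. Labels that are not plain integers (like '21D') are set aside and returned instead of raising ValueError.

```python
import re


def split_valid_clues(clues):
    """Separate numerically labelled clues from malformed ones.

    `clues` is assumed to be a dict of string label -> clue data
    (an assumption inferred from the traceback, not confirmed).
    Returns (valid, skipped): `valid` is a list of (number, clue)
    pairs sorted by number, `skipped` is a list of rejected labels.
    """
    valid = []
    skipped = []
    for label, clue in clues.items():
        # Accept only labels made entirely of digits, e.g. "21".
        if re.fullmatch(r"\d+", label):
            valid.append((int(label), clue))
        else:
            skipped.append(label)
    # Sort on the number only, so clue data is never compared.
    valid.sort(key=lambda pair: pair[0])
    return valid, skipped


# Example shaped like the failing puzzle: a stray "21D" entry.
down = {"3": "clue c", "21": "clue a", "1": "clue b", "21D": "duplicate"}
valid, skipped = split_valid_clues(down)
print(valid)
print(skipped)
```

Returning the skipped labels lets the caller log a warning, so a genuine mistake in the source puzzle is surfaced rather than silently dropped.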

mixographer commented 1 week ago

Ahh. Makes sense.