Closed iNtEgraIR2021 closed 2 years ago
Hi there! You're totally right on the AmuseLabs downloader, I just reorganized that code and missed that (and one other) which I'll fix right after this.
I think some outlets that use AmuseLabs enable a date-picker and some do not, and if they don't you might have to approach it a little differently. If you're interested in building out a full McKinsey downloader, you might look to the New Yorker or WSJ files for some pointers on how I've done it. Basically it's just scraping a landing page like the one you've linked to and pulling out the link to the latest puzzle from there.
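As a rough illustration of the landing-page approach, here is a minimal standard-library sketch. It is not the code used in the New Yorker or WSJ downloaders; `LinkCollector`, `find_latest_link`, the `marker` argument, and the sample HTML are all hypothetical, and it assumes the newest puzzle is the first matching link on the page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_latest_link(landing_html, base_url, marker):
    """Return the absolute URL of the first link containing `marker`, or None."""
    collector = LinkCollector()
    collector.feed(landing_html)
    for href in collector.hrefs:
        if marker in href:
            return urljoin(base_url, href)
    return None


# Hypothetical landing-page fragment (assumption: newest puzzle is listed first).
SAMPLE_HTML = """
<ul>
  <li><a href="/about">About</a></li>
  <li><a href="/puzzles/2021-06-15">Latest puzzle</a></li>
  <li><a href="/puzzles/2021-06-08">Previous puzzle</a></li>
</ul>
"""

latest = find_latest_link(SAMPLE_HTML, "https://example.com/", "/puzzles/")
print(latest)
```

In a real downloader the HTML would come from fetching the outlet's landing page, and the marker would have to match however that page actually links to its puzzles.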
(If you did implement a `find_latest` and wanted to take it one step further, you could reasonably use the date format in the URL to search by date. I'd prefer not to guess URLs for the latest puzzle, though, if it's possible to scrape a page, because imo `find_latest` should always return a puzzle whereas search by date can exit unsuccessfully if there wasn't a puzzle published on that date.)
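To make the search-by-date idea concrete, here is a small sketch of how a date could be turned into a URL, with a miss reported as `None`. The URL template, `url_for_date`, `search_by_date`, and the `url_exists` callback are all assumptions for illustration; the real date format would have to be read off the outlet's actual puzzle URLs.

```python
import datetime

# Hypothetical URL pattern; not taken from any real outlet.
URL_TEMPLATE = "https://example.com/puzzles/{day:%Y-%m-%d}"


def url_for_date(day, template=URL_TEMPLATE):
    """Build the puzzle URL for a given date from the assumed template."""
    return template.format(day=day)


def search_by_date(day, url_exists, template=URL_TEMPLATE):
    """Return the puzzle URL for `day`, or None if nothing was published then.

    `url_exists` is any callable that reports whether a URL resolves, for
    example a thin wrapper around an HTTP request.
    """
    url = url_for_date(day, template)
    return url if url_exists(url) else None


# Stand-in for the network: the set of URLs that would resolve.
PUBLISHED = {"https://example.com/puzzles/2021-06-15"}

hit = search_by_date(datetime.date(2021, 6, 15), PUBLISHED.__contains__)
miss = search_by_date(datetime.date(2021, 6, 16), PUBLISHED.__contains__)
print(hit)
print(miss)
```

The `None` result for a date with no puzzle is the "exit unsuccessfully" case, which is why this kind of lookup is a poor substitute for scraping when the goal is the latest puzzle.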
Sorry, meant to merge! merging now.
Hi @thisisparker 👋🏼
While trying to download crosswords[^1] from McKinsey's The McKinsey Crossword, I noticed a possible bug in line 78 of `xword_dl.py`.
Afterwards I managed to successfully download crosswords with:
I also tried to write a downloader class for The McKinsey Crossword; however, I failed to find the date picker page used by the downloaders for The Atlantic or The Daily Beast.
According to the help modal screen of the crossword UI, McKinsey apparently uses the id mck at Amuselabs. Unfortunately, using mck or mckinsey returned a 404 error when trying to access the date picker screen (https://cdn3.amuselabs.com/ID/date-picker?set=ID).
Regards, Petra
I'm not affiliated with any of the organizations named above.
[^1]: using Amuselabs PuzzleMe