stuartthomas25 / JParty

A party game which simulates the famous game show "Jeopardy!" using the J-Archive (http://j-archive.com).
https://www.stuartthomas.us/jparty/
GNU General Public License v3.0

[Enhancement] Scrape Wayback Machine instead of J-Archive directly #9

benf2004 closed this issue 1 year ago

benf2004 commented 1 year ago

I was thinking of a way to avoid scraping J-Archive directly, and I thought: why not scrape the Wayback Machine instead? Yes, using one archive to scrape another. I think all (or nearly all) of the games are already on there (I even hand-checked the 10 most recent episodes), so coverage shouldn't be much of a problem. That way the app would not be breaking J-Archive's terms, and we could actually share it with more people. The nice thing about J-Archive is that its page layout hasn't changed in years, so older snapshots should parse the same way as the current pages. We could build in a fallback that scrapes J-Archive itself, but that wouldn't be necessary if we made sure all the recent games are captured by the Wayback Machine.
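
Here is a rough sketch of how the lookup could work, not a finished implementation. It assumes the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) and J-Archive game pages of the form `showgame.php?game_id=...`. The function names (`availability_query`, `snapshot_url`, `fetch_game_html`) and the `allow_fallback` flag are hypothetical and are not taken from JParty's code.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoints; neither constant comes from the JParty codebase.
AVAILABILITY_API = "https://archive.org/wayback/available"
J_ARCHIVE_GAME = "https://j-archive.com/showgame.php?game_id={game_id}"


def availability_query(game_id):
    """Build the availability-API URL for one J-Archive game page."""
    target = J_ARCHIVE_GAME.format(game_id=game_id)
    return AVAILABILITY_API + "?" + urlencode({"url": target})


def snapshot_url(api_response):
    """Return the closest snapshot URL from the parsed JSON, or None."""
    closest = api_response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def _http_get(url):
    """Default fetcher: plain HTTP GET returning decoded text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", "replace")


def fetch_game_html(game_id, fetch=_http_get, allow_fallback=False):
    """Fetch a game page from the Wayback Machine.

    Returns None when no snapshot exists, unless allow_fallback is set,
    in which case the live J-Archive page is requested instead.
    """
    lookup = json.loads(fetch(availability_query(game_id)))
    url = snapshot_url(lookup)
    if url is None:
        if not allow_fallback:
            return None
        url = J_ARCHIVE_GAME.format(game_id=game_id)
    return fetch(url)
```

The `fetch` parameter is only there so the lookup logic can be tried with a stub instead of real network calls. The extra availability request is also where part of the added load time would come from.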

The one downside is that it would be a few seconds slower to load (exactly how long, we'll have to see), but I personally think the tradeoff is worth it.

It seems pretty easy to implement. I may take a look if I have time. Here's a useful tutorial: https://medium.com/analytics-vidhya/the-wayback-machine-scraper-63238f6abb66