Pull historical deck data from web archive

kphurley commented 4 months ago

Not sure yet how feasible this is, but it's clear the community would like the historical decks from cardgamedb preserved.

I think we'll need to understand just how decks are going to be modeled in this system before tackling this, but I wanted to at least capture this use case and start thinking about it.

kphurley commented 4 months ago

Here's the most promising flow I've found so far:

Click on a deck in the archive
Click on export > bb code
- This appears analogous to doing document.querySelector('#exportTextArea').value. The element seems to be present as soon as the deck renders, so we only need to get the deck's page rendered.

Once we have the BB code export, we'll need to write a parser to determine the 10 objectives, and then we'll at least have the deck's contents.
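A rough sketch of what that parser's first pass could look like. The actual export format has not been checked in this thread, so this assumes (hypothetically) that each card line looks something like "2x [url=...]Card Name[/url]". The names stripBbCode and parseDeckLines are made up for illustration, and picking out the 10 objectives would still need card data on top of this.

```javascript
// ASSUMPTION: card lines look like "2x [url=...]Card Name[/url]".
// The real cardgamedb export format has not been verified.

// Remove BBCode-style tags such as [b], [/url], or [url=http://...].
function stripBbCode(text) {
  return text.replace(/\[\/?[a-zA-Z*]+(?:=[^\]]*)?\]/g, "");
}

// Turn the export text into a list of { quantity, name } entries.
// Lines without a leading "Nx" get quantity null so the caller can
// decide whether they are section headers or single cards.
function parseDeckLines(bbCode) {
  const entries = [];
  for (const rawLine of bbCode.split(/\r?\n/)) {
    const line = stripBbCode(rawLine).trim();
    if (line === "") continue; // skip blank lines
    const match = line.match(/^(\d+)\s*x\s+(.+)$/i);
    if (match) {
      entries.push({ quantity: Number(match[1]), name: match[2].trim() });
    } else {
      entries.push({ quantity: null, name: line });
    }
  }
  return entries;
}
```

Calling parseDeckLines(document.querySelector('#exportTextArea').value) on a rendered deck page would be the quickest way to see how far off the assumed format is.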

The title should be easy to scrape as well - it'll match the link clicked on to render the deck's page.

The author's comments are here: document.querySelector('#dStrat').textContent. Getting the metadata about the author and timestamp might be tricky. This worked, but it's probably the least reliable of the three: document.querySelector('.blend_links').textContent.
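To keep the three selectors in one place, something like the following could be pasted into the browser console on a rendered deck page. This is only a sketch: scrapeDeck and textOf are hypothetical helper names, and the selectors are the ones noted above, which may not hold for every deck.

```javascript
// Read one string property from the first element matching `selector`.
// Returns null when the element or the property is missing.
function textOf(doc, selector, property) {
  const el = doc.querySelector(selector);
  if (!el) return null;
  const value = el[property];
  return typeof value === "string" ? value.trim() : null;
}

// Collect the raw pieces of a deck from a rendered deck page.
// `doc` is anything with a querySelector method (normally `document`).
function scrapeDeck(doc) {
  return {
    bbCode: textOf(doc, "#exportTextArea", "value"),
    authorComments: textOf(doc, "#dStrat", "textContent"),
    // Least reliable of the three: author and timestamp share this element.
    authorAndTimestamp: textOf(doc, ".blend_links", "textContent"),
  };
}
```

Running scrapeDeck(document) gives null for any field whose element did not render, which makes it easier to spot pages where the scrape only partly worked.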

kphurley commented 3 months ago

The site no longer loads much of its content correctly, so this might be impossible now. We could check whether there's a web archive version we can pull this from, though.
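One way to check for an archived copy is the Wayback Machine availability endpoint (https://archive.org/wayback/available), which returns the closest snapshot for a URL. The sketch below assumes that endpoint and a runtime with a global fetch (a browser or Node 18+). The function names are hypothetical, and whether a snapshot includes the rendered #exportTextArea element is untested.

```javascript
// Build the availability lookup URL for a page.
// `timestamp` is optional, in YYYYMMDD form, and picks the closest snapshot to that date.
function buildAvailabilityUrl(pageUrl, timestamp) {
  const params = new URLSearchParams({ url: pageUrl });
  if (timestamp) params.set("timestamp", timestamp);
  return `https://archive.org/wayback/available?${params.toString()}`;
}

// Pull the snapshot URL out of the JSON response, or null if none is available.
function closestSnapshotUrl(responseJson) {
  const closest = responseJson?.archived_snapshots?.closest;
  return closest && closest.available ? closest.url : null;
}

// Look up the closest archived snapshot for a deck page.
async function findSnapshot(pageUrl, timestamp) {
  const response = await fetch(buildAvailabilityUrl(pageUrl, timestamp));
  if (!response.ok) {
    throw new Error(`Availability lookup failed: ${response.status}`);
  }
  return closestSnapshotUrl(await response.json());
}
```

If findSnapshot returns a URL for a deck page, the same selector-based scrape could be tried against that snapshot instead of the live site.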

kphurley / swlcgdb

Pull historical deck data from web archive #22