Closed trtm closed 4 years ago
Hey trtm, thanks for your note.
If the request fails during scrape, we do fall back to the Wayback Machine (https://github.com/CurationCorp/curation-corpus/blob/master/web_scraper.py#L17). I'm reluctant to use it for every request because I don't want to hammer them.
One way to keep the dataset consistent over time would be to check (or create a new entry) at the Wayback Machine of the Web Archive ( https://web.archive.org/ )