divergentdave closed this 10 years ago
This is fantastic, @divergentdave. Ping the thread whenever you think it's merge-ready.
Is this stuff worth merging in as is? I'll take all the fixes so far!
The two caveats I have right now are that this will start spewing warnings for the six remaining scrapers, and I want to go back and add some comments in the "remarks for IG webmaster" section. Otherwise, it should be ready.
I'm going to accept the warnings, in the interest of more working unique report IDs. I'll leave the branch in place rather than deleting it, so you can file a new pull request from the same branch if you continue your work here.
This branch is a WIP for correcting cases where the same report_id is used for two different reports that fall in the same year and thus can't be caught by the QA scripts. The first two commits add validation; everything else is tweaks to scrapers. I've taken some of the easy ones already, but there is still plenty to do.
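For readers following along, here is a rough sketch of the kind of check described above. It is not the code from this branch: the function name `find_duplicate_report_ids`, the warning class, and the `year` and `url` fields are all hypothetical, chosen only to illustrate flagging a report_id that appears on two different reports within the same year.

```python
import warnings


class DuplicateReportIdWarning(UserWarning):
    """Hypothetical warning type for this sketch (not from the branch)."""


def find_duplicate_report_ids(reports):
    """Return pairs of distinct reports sharing a (year, report_id) key.

    Each report is assumed to be a dict with "report_id", "year", and
    "url" keys; those field names are assumptions made for illustration.
    """
    seen = {}
    duplicates = []
    for report in reports:
        key = (report["year"], report["report_id"])
        previous = seen.get(key)
        if previous is None:
            # First time this ID shows up for this year.
            seen[key] = report
        elif previous["url"] != report["url"]:
            # Same ID and year, but a different document: warn and record it.
            duplicates.append((previous, report))
            warnings.warn(
                "report_id %r is used for two reports in %s: %s and %s"
                % (report["report_id"], report["year"],
                   previous["url"], report["url"]),
                DuplicateReportIdWarning,
            )
    return duplicates
```

Comparing URLs is just one possible way to tell "the same report listed twice" apart from "two reports with one ID"; a real check might compare titles or file hashes instead. Emitting a warning rather than raising keeps scrapers that have not been fixed yet running.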
Changes to scrapers tend to be
Checklist of scrapers
Here's a copy of the list of duplicates I'm working from: https://gist.github.com/divergentdave/d520271903ebf8f02776