openelections / openelections-data-or

Pre-processed results for Oregon elections
MIT License

Verifying candidate vote checksums per county? #149

Open nk9 opened 7 years ago

nk9 commented 7 years ago

I think you've been proven right that this is the right stage of the process for a script to verify some basic things about the data. We've found all sorts of issues that would have been much harder to diagnose later on.

One thing we still don't have, though, is a verification of the final vote tallies. I see that each election has a CSV file, e.g. 20100518__or__primary.csv. I guess this data comes from the state? Is there already a script that adds up all the county results by candidate and verifies the totals against this file? And if not, should there be? :-)

One thing this might require, though, is normalization of candidate names.
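To make the idea concrete, here is a minimal sketch of such a check. It is not an existing script from this repo. It assumes the county files have `county`, `office`, `candidate`, and `votes` columns and that the statewide file has at least `office`, `candidate`, and `votes`. The function names, the normalization rule, and the sample rows are all made up for illustration.

```python
import csv
import io
from collections import defaultdict


def normalize_name(name):
    """Collapse case, periods, and repeated whitespace so name variants compare equal.

    This rule is an assumption; real data may need nickname or suffix handling too.
    """
    return " ".join(name.replace(".", " ").split()).upper()


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def sum_votes(rows):
    """Total votes per (office, normalized candidate) across all rows."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["office"].strip(), normalize_name(row["candidate"]))
        totals[key] += int(row["votes"].replace(",", ""))
    return dict(totals)


def find_mismatches(county_rows, statewide_rows):
    """Return {key: (county_sum, statewide_total)} for every key whose totals differ.

    A key missing from one side shows up with None on that side.
    """
    county_totals = sum_votes(county_rows)
    state_totals = sum_votes(statewide_rows)
    mismatches = {}
    for key in sorted(set(county_totals) | set(state_totals)):
        pair = (county_totals.get(key), state_totals.get(key))
        if pair[0] != pair[1]:
            mismatches[key] = pair
    return mismatches


# Made-up sample data, not real Oregon results.
COUNTY_CSV = """county,office,candidate,votes
Baker,Governor,Jane Q. Doe,100
Baker,Governor,John Roe,50
Benton,Governor,JANE Q DOE,200
Benton,Governor,John Roe,75
"""

STATE_CSV = """office,candidate,votes
Governor,Jane Q. Doe,300
Governor,John Roe,130
"""

mismatches = find_mismatches(read_rows(COUNTY_CSV), read_rows(STATE_CSV))
for (office, candidate), (county_sum, state_total) in mismatches.items():
    print(f"{office} / {candidate}: counties={county_sum}, statewide={state_total}")
```

In the sample, the two spellings of Jane Q. Doe normalize to the same key and her totals agree, so only John Roe is reported. Without the normalization step, the spelling variants would be counted as separate candidates and both would appear as mismatches.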

nk9 commented 7 years ago

This code from the CA data project looks like it will do what I'm asking. It also does a few of the other checks (like verifying that every line in the file has the same county), but misses many of them. @charles-difazio
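For reference, the "same county on every line" check mentioned above could look roughly like the following. This is an independent sketch, not the CA project's code. It assumes each county file has a `county` column, and the function name and sample rows are hypothetical.

```python
import csv
import io


def check_single_county(rows, expected=None):
    """Return a list of error strings; an empty list means the check passed.

    `expected` is an optional county name, e.g. one derived from the filename.
    """
    counties = {row["county"].strip() for row in rows}
    errors = []
    if len(counties) != 1:
        errors.append("expected exactly one county, found: %s" % sorted(counties))
    elif expected is not None and counties != {expected}:
        errors.append("expected county %r, found %r" % (expected, sorted(counties)[0]))
    return errors


# Made-up sample data: the last row carries the wrong county.
BAD_CSV = """county,office,candidate,votes
Baker,Governor,Jane Q. Doe,100
Baker,Governor,John Roe,50
Benton,Governor,John Roe,75
"""

bad_rows = list(csv.DictReader(io.StringIO(BAD_CSV)))
for error in check_single_county(bad_rows, expected="Baker"):
    print(error)
```

Returning a list of errors instead of raising on the first one makes it easier to report every problem in a file in a single run.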

charles-difazio commented 7 years ago

Hey, glad that looks useful!

Feel free to drop it in here for your purposes. I'd also suggest setting up Travis CI so that any data changes and updates will get validated automatically.

In the medium term, we should probably pull the validation code into its own library. I've been meaning to generalize it, but haven't gotten around to it yet (I'm trying to get CA 2016 to pass all the tests first).