openelections / openelections-data-pa

Pre-processed election results for Pennsylvania elections

2020 General Election Precinct Results #69

Closed: dwillis closed this issue 3 years ago

dwillis commented 3 years ago

Using Tabula, OCR or whatever method you can, parse precinct-level results for the following counties. Original sources are in the sources-pa repository.

The goal is to create a single CSV file for each county, with the following headers:

county, precinct, office, district, party, candidate, votes

If the county file also provides a breakdown of votes by method, include that using the following headers:

early_voting, election_day, provisional, absentee

Include the following offices:

The CSV files should be named 20201103__pa__general__{county}__precinct.csv. Here's an example finished file: https://github.com/openelections/openelections-data-pa/blob/master/2020/20200602__pa__primary__elk__precinct.csv.
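For anyone checking a file before submitting, here is a minimal Python sketch of the naming and header conventions described above. The function names are hypothetical, and lowercasing the county name (with spaces turned into underscores) is an assumption based on the `elk` example, not a documented rule.

```python
import csv
import io

# Column names taken from the issue text above.
REQUIRED = ["county", "precinct", "office", "district", "party", "candidate", "votes"]
OPTIONAL = ["early_voting", "election_day", "provisional", "absentee"]


def expected_filename(county):
    """Build the 2020 general election filename for a county.

    Assumption: the county slug is the lowercased name with spaces
    replaced by underscores.
    """
    slug = county.strip().lower().replace(" ", "_")
    return f"20201103__pa__general__{slug}__precinct.csv"


def check_headers(csv_text):
    """Return (missing required headers, headers not named in the issue)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [h.strip() for h in next(reader)]
    missing = [h for h in REQUIRED if h not in header]
    unexpected = [h for h in header if h not in REQUIRED + OPTIONAL]
    return missing, unexpected
```

Both lists coming back empty suggests the header row matches the layout requested here; it says nothing about whether the rows themselves are correct.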

mileswwatkins commented 3 years ago

Thanks as always for y'all's hard work here! A few initial issues I've experienced when using these precinct-level CSVs:

mileswwatkins commented 3 years ago

Perry County's New Buffalo is missing from the CSV. It probably failed to parse from the PDF because its name falls right before a page break.

[Screenshot: Screen Shot 2021-02-18 at 13 33 44]

mileswwatkins commented 3 years ago

Here's my last set of notes for today. Thank you again for the hard work!

dwillis commented 3 years ago

@mileswwatkins many thanks for these - we'll get on them.

mileswwatkins commented 3 years ago

Oh, and Erie County's 40001 - WAYNE TOWNSHIP has a trailing space in its precinct name.
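A check along these lines can catch stray whitespace like this. It is only a sketch: the function name is hypothetical, and it assumes the file has a `precinct` column as in the header list at the top of this issue.

```python
import csv
import io


def precincts_with_stray_whitespace(csv_text):
    """Return precinct names that have leading or trailing whitespace."""
    flagged = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["precinct"]
        # csv does not trim field values, so a padded name survives parsing.
        if name != name.strip():
            flagged.add(name)
    return sorted(flagged)
```

Calling `.strip()` on the precinct value when the CSV is written would avoid the problem at the source.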

mileswwatkins commented 3 years ago

Blair County's CSV failed to extract the full/distinct precinct names from the ElectionWare PDFs.

For example, "Altoona Ward 2, Precinct 2" in the PDF appears as only "Altoona Ward 2" in the CSV, and similarly "Blair Township, District 3" becomes "Blair Township", etc. (The county has lots and lots of these numbered precincts, FWIW.)
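One possible way to spot this kind of collapse, sketched in Python: if several distinct precincts were written under one shortened name, the same candidate would likely show up more than once for that name. The function name is hypothetical, and the check assumes the collapsed precincts were left as separate rows rather than summed together.

```python
import csv
import io
from collections import Counter


def duplicated_result_rows(csv_text):
    """Return (precinct, office, district, party, candidate) keys seen more than once."""
    counts = Counter(
        (row["precinct"], row["office"], row["district"], row["party"], row["candidate"])
        for row in csv.DictReader(io.StringIO(csv_text))
    )
    return sorted(key for key, seen in counts.items() if seen > 1)
```

An empty result does not prove the names are right, but any key returned is worth comparing against the source PDF.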

dwillis commented 3 years ago

@mileswwatkins Ok, I believe all of these issues have been resolved.

mileswwatkins commented 3 years ago

Thank you so much, @dwillis! Just finished another round of QA, including comparing county vote totals against AP/Edison, and everything looks great.

The largest discrepancy is that Beaver County is a couple thousand votes short (y'all have an older unofficial-results PDF instead of the final/official results currently on their site), but that doesn't affect my use case :)
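For reference, the county-total comparison can be sketched roughly like this. This is not the exact script used for the QA above: the function names and the shape of the `reference` dictionary are assumptions, and it expects the `votes` column to hold plain integers.

```python
import csv
import io
from collections import defaultdict


def county_totals(csv_text):
    """Sum precinct-level votes per (office, candidate) for one county file."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["office"], row["candidate"])] += int(row["votes"])
    return dict(totals)


def discrepancies(totals, reference):
    """Return {key: csv_total - reference_total} for every key that differs.

    `reference` maps (office, candidate) to a total from another source.
    """
    return {
        key: totals.get(key, 0) - expected
        for key, expected in reference.items()
        if totals.get(key, 0) != expected
    }
```

A negative difference means the CSV is short of the reference total, which is how a stale source PDF like Beaver's would show up.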

dwillis commented 3 years ago

@mileswwatkins awesome, thanks for letting me know. We've updated Beaver.