openelections / openelections-data-ut

Pre-processed election results for Utah

2020 General Election Precinct Results #41

Open dwillis opened 3 years ago

dwillis commented 3 years ago

If you want to work on a county, add a comment saying which one you'd like to work on or email openelections@gmail.com. You can either email us finished CSV files or submit a pull request, whatever is easiest.

The results files you'll be converting are in the Utah sources repository. Many of these are either PDFs or Excel files. For electronic PDFs, we recommend using Tabula, which is free, to extract data. The goal is to create a single CSV file for each county, with the following headers:

county, precinct, office, district, party, candidate, votes

For the following offices: Registered Voters, Ballots Cast, President, U.S. House, Governor, Attorney General, State Auditor, State Treasurer, State Senate, State House

File names should be: 20201103__ut__general__{county_name}__precinct.csv, with the county name lower case and spaces replaced with underscores.
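A minimal Python sketch of the naming rule above. The helper name `precinct_filename` is hypothetical and not part of any OpenElections tooling; it only lower-cases the county name and swaps spaces for underscores as described.

```python
def precinct_filename(county_name: str) -> str:
    """Build the 2020 general precinct file name for a Utah county."""
    # Lower-case the county name and replace spaces with underscores.
    slug = county_name.strip().lower().replace(" ", "_")
    return f"20201103__ut__general__{slug}__precinct.csv"


print(precinct_filename("Salt Lake"))
```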

bbrewington commented 3 years ago

Starting on Cache

andysylvester commented 3 years ago

I will take Beaver County

mileswwatkins commented 3 years ago

Thank you all! While ingesting the statewide GE precinct file, I noticed two counties where the candidate-name and party-name columns were incorrectly swapped:

Cache

Looks correct in this repo's single-county source file, but got swapped when appending the files maybe? It seems like the order of CSV headers/columns is one way in the county file:

county,precinct,office,district,party,candidate,votes,mail,provisional

and another way in the statewide file:

county,precinct,office,district,candidate,party,votes,mail,election_day,early_voting

Here's an excerpt from the statewide file:

county,precinct,office,district,candidate,party,votes,mail,election_day,early_voting
...
Cache,AMALGA,President,,,"BROCK PIERCE, KARLA BALLARD",0,,,
Cache,AMALGA,President,,,"KANYE WEST, MICHELLE TIDBALL",1,,,
Cache,AMALGA,President,,DEM,JOSEPH R. BIDEN,43,,,
Cache,AMALGA,President,,CON,DON BLANKENSHIP,4,,,
Cache,AMALGA,President,,LBT,JO JORGENSEN,3,,,
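One way to avoid this kind of swap is to map columns by header name rather than by position when appending county files. The sketch below is an assumption about how such a merge could be written, not a description of the script that actually built the statewide file; `merge_county_files` and `STATEWIDE_FIELDS` are hypothetical names, and columns missing from the statewide header (such as `provisional`) are simply dropped here.

```python
import csv
import io

# Column order of the statewide file, as quoted in the comment above.
STATEWIDE_FIELDS = [
    "county", "precinct", "office", "district", "candidate",
    "party", "votes", "mail", "election_day", "early_voting",
]


def merge_county_files(county_texts):
    """Append county CSVs into one CSV, matching columns by header name."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=STATEWIDE_FIELDS, lineterminator="\n")
    writer.writeheader()
    for text in county_texts:
        # DictReader keys each value by its own file's header, so a county
        # file with party before candidate is still written out correctly.
        for row in csv.DictReader(io.StringIO(text)):
            writer.writerow({f: row.get(f) or "" for f in STATEWIDE_FIELDS})
    return out.getvalue()


cache = (
    "county,precinct,office,district,party,candidate,votes,mail,provisional\n"
    "Cache,AMALGA,President,,DEM,JOSEPH R. BIDEN,43,,\n"
)
print(merge_county_files([cache]))
```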

Weber

Columns are actually incorrectly labeled in the single-county source file:

county,precinct,office,district,party,candidate,votes
...
Weber,(Unincorporated) WNO001,President,,Kanye West / Michelle Tidball,,0
Weber,(Unincorporated) WNO002,President,,Kanye West / Michelle Tidball,,0
Weber,(Unincorporated) WNO003,President,,Kanye West / Michelle Tidball,,0
Weber,Farr West 001,President,,Joseph R. Biden,DEM,123
Weber,Farr West 002,President,,Joseph R. Biden,DEM,179
Weber,Farr West 003,President,,Joseph R. Biden,DEM,270
Weber,Farr West 004,President,,Joseph R. Biden,DEM,344
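A quick sanity check can catch mislabeled columns like these before a file is merged. The sketch below assumes that party values are either empty or short upper-case codes (as with DEM, REP, CON, LIB, GRN and IAP elsewhere in this thread); that rule and the name `suspicious_party_rows` are illustrative assumptions, not project conventions.

```python
import csv
import io
import re

# Assumption: a valid party value is 2 to 4 upper-case letters, or empty.
PARTY_CODE = re.compile(r"[A-Z]{2,4}")


def suspicious_party_rows(csv_text):
    """Return rows whose party column does not look like a party code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["party"] and not PARTY_CODE.fullmatch(row["party"])
    ]


weber = (
    "county,precinct,office,district,party,candidate,votes\n"
    "Weber,(Unincorporated) WNO001,President,,Kanye West / Michelle Tidball,,0\n"
    "Weber,Farr West 001,President,,Joseph R. Biden,DEM,123\n"
)
for row in suspicious_party_rows(weber):
    print(row["precinct"], "->", row["party"])
```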

mileswwatkins commented 3 years ago

It also appears that, when parsing the Iron County results PDF, all the Cedar City precincts lost their numeric identifier. That means that all 25 precincts starting with CC all received the same exact precinct name (CC) in the CSV and are not differentiated.

The precinct names do appear to have the proper suffix in the source PDF, eg:

[Two screenshots of the source PDF: Screen Shot 2021-01-31 at 21 45 54, Screen Shot 2021-01-31 at 21 46 06]
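Collapsed precinct names like this tend to show up as repeated rows for the same precinct, office and candidate. The following is a rough sketch of such a duplicate check; `duplicate_keys` is a hypothetical helper and the vote counts in the sample are made up for illustration.

```python
import csv
import io
from collections import Counter

# Fields that should together identify one result row.
KEY_FIELDS = ("county", "precinct", "office", "district", "party", "candidate")


def duplicate_keys(csv_text):
    """Return result keys that occur more than once, with their counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[f] for f in KEY_FIELDS) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


# Illustrative rows only; the vote counts are not real results.
iron = (
    "county,precinct,office,district,party,candidate,votes\n"
    "Iron,CC,President,,DEM,JOSEPH R. BIDEN,10\n"
    "Iron,CC,President,,DEM,JOSEPH R. BIDEN,12\n"
)
print(duplicate_keys(iron))
```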

mileswwatkins commented 3 years ago

Wasatch County appears to have two successive issues: first, in the single-county CSV file, the precincts starting with Wasatch 32: seem to be corrupted (with the number after the colon incrementing each and every line of the CSV):

Wasatch,32,Constitutional Amendment F,,,FOR,86
Wasatch,32,Constitutional Amendment F,,,AGAINST,78
Wasatch,32,Constitutional Amendment G,,,FOR,87
Wasatch,32,Constitutional Amendment G,,,AGAINST,84
Wasatch,32:2,Registered Voters,,,,72
Wasatch,32:3,Ballots Cast,,,,69
Wasatch,32:5,President,,REP,DONALD J. TRUMP,61
Wasatch,32:6,President,,DEM,JOSEPH R. BIDEN,7
Wasatch,32:7,President,,,"JOE MCHUGH, ELIZABETH STORM",0
Wasatch,32:8,President,,,BROCK PIERCE,0
Wasatch,32:9,President,,CON,DON BLANKENSHIP,0
Wasatch,32:10,President,,GRN,HOWIE HAWKINS,0
Wasatch,32:11,President,,LIB,JO JORGENSEN,0
Wasatch,32:12,President,,,"KANYE WEST, MICHELLE TIDBALL",0
Wasatch,32:13,President,,,GLORIA LA RIVA,0
Wasatch,32:14,President,,,Write-In: Jade Simmons,0
Wasatch,32:15,President,,,Write-In: Brian Carroll,0

This then seems to be interpreted as times of day by the script that generated the statewide file, creating wonky precinct names:

Wasatch,32,State Treasurer,,RICHARD PROCTOR,CON,16,,,
Wasatch,32,State Senate,27,DAVID PARLEY HINKINS,REP,140,,,
Wasatch,32,State House,54,MIKE KOHLER,REP,150,,,
Wasatch,32,State House,54,MEAGHAN MILLER,DEM,30,,,
Wasatch,32:02:00,Registered Voters,,,,72,,,
Wasatch,32:03:00,Ballots Cast,,,,69,,,
Wasatch,32:05:00,President,,DONALD J. TRUMP,REP,61,,,
Wasatch,32:06:00,President,,JOSEPH R. BIDEN,DEM,7,,,
Wasatch,32:07:00,President,,"JOE MCHUGH, ELIZABETH STORM",,0,,,
Wasatch,32:08:00,President,,BROCK PIERCE,,0,,,
Wasatch,32:09:00,President,,DON BLANKENSHIP,CON,0,,,
Wasatch,32:10:00,President,,HOWIE HAWKINS,GRN,0,,,
Wasatch,32:11:00,President,,JO JORGENSEN,LIB,0,,,
Wasatch,32:12:00,President,,"KANYE WEST, MICHELLE TIDBALL",,0,,,
Wasatch,32:13:00,President,,GLORIA LA RIVA,,0,,,
Wasatch,32:14:00,President,,Write-In: Jade Simmons,,0,,,
Wasatch,32:15:00,President,,Write-In: Brian Carroll,,0,,,

but this time-of-day interpretation also seems to have happened for all other Wasatch County precincts with : in their names, such as 31:1:

Wasatch,31,State Treasurer,,RICHARD PROCTOR,CON,85,,,
Wasatch,31,State Treasurer,,JOSEPH SPECIALE,LIB,57,,,
Wasatch,31,State House,54,MIKE KOHLER,REP,676,,,
Wasatch,31,State House,54,MEAGHAN MILLER,DEM,173,,,
Wasatch,31:01:00,Registered Voters,,,,50,,,
Wasatch,31:01:00,Ballots Cast,,,,45,,,
Wasatch,31:01:00,President,,DONALD J. TRUMP,REP,36,,,
Wasatch,31:01:00,President,,JOSEPH R. BIDEN,DEM,9,,,
Wasatch,31:01:00,President,,"JOE MCHUGH, ELIZABETH STORM",,0,,,
Wasatch,31:01:00,President,,BROCK PIERCE,,0,,,
Wasatch,31:01:00,President,,DON BLANKENSHIP,CON,0,,,

(The county's source PDF looks fine/correct.)
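Whatever tool produced the time-of-day values (the thread does not say), the damaged names follow a recognizable pattern and can be flagged after the fact. This sketch assumes that no legitimate precinct name has the form `NN:NN:NN`; `looks_like_time` is a hypothetical helper.

```python
import re

# Matches values such as "31:01:00" or "32:15:00".
TIME_LIKE = re.compile(r"\d{1,2}:\d{2}:\d{2}")


def looks_like_time(precinct: str) -> bool:
    """True if a precinct name looks like it was coerced to a time of day."""
    return TIME_LIKE.fullmatch(precinct) is not None


for name in ["31:01:00", "31:1", "32", "AMALGA"]:
    print(name, looks_like_time(name))
```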

mileswwatkins commented 3 years ago

Update: Aha, this seems to occur whenever/wherever a candidate (in this case, Trump) receives more than 1000 votes in these two counties, when using this PDF parsing technique for Electionware PDFs. See also Washington County's SG44 and WA68, to name a couple. Probably due to the thousands-place comma. Cross-referencing official election results, I don't see this happening to Biden at all, and so presumably no other candidates either.
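If the thousands-place comma is indeed the cause (which is only a guess at this point), vote counts need the comma stripped before conversion once the token has been isolated from the PDF text. A minimal sketch, with `parse_votes` as a hypothetical helper:

```python
def parse_votes(text: str) -> int:
    """Convert a vote count such as "1,056" to an integer."""
    # Remove thousands separators so "1,056" is read as one number.
    return int(text.strip().replace(",", ""))


print(parse_votes("1,056"))
```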


Parsing of the Juab County PDF seems to have split the MO05 precinct into two: MO05 (containing presidential results for West, Pierce, Biden, and Blankenship) and MOHR (containing Jorgensen, McHugh, Hawkins, and Trump).

This same issue seems to have happened in Washington County: SC71 was split into two (with the second half named MOHR):

Washington,SC71,Ballots Cast,,,,1297
Washington,SC71,Ballots Cast Blank,,,,0
Washington,SC71,President,,,"BROCK PIERCE, KARLA BALLARD",0
Washington,SC71,President,,,"KANYE WEST, MICHELLE TIDBALL",5
Washington,SC71,President,,DEM,JOSEPH R. BIDEN,181
Washington,SC71,President,,CON,DON BLANKENSHIP,3
Washington,MOHR,President,,LIB,JO JORGENSEN,26
Washington,MOHR,President,,,"JOE MCHUGH, ELIZABETH STORM",1
Washington,MOHR,President,,GRN,HOWIE HAWKINS,5
Washington,MOHR,President,,,GLORIA LA RIVA,0
Washington,MOHR,President,,REP,DONALD J. TRUMP,1056
Washington,MOHR,President,,,Write-In Totals,17
Washington,MOHR,President,,,Write-In: Un-Registered Write In,16
Washington,MOHR,President,,,Write-In: BRIAN CARROLL,0
Washington,MOHR,President,,,Write-In: JADE SIMMONS,0
Washington,MOHR,U.S. House,2,DEM,KAEL WESTON,138
Washington,MOHR,U.S. House,2,REP,CHRIS STEWART,1009
Washington,MOHR,U.S. House,2,LIB,J. ROBERT LATHAM,44
Washington,SC71,Governor,,DEM,CHRIS PETERSON,138
Washington,SC71,Governor,,IAP,GREG DUERDEN,19
Washington,SC71,Governor,,REP,SPENCER J. COX,992
Washington,SC71,Governor,,LIB,DANIEL RHEAD COTTAM,55
Washington,SC71,Governor,,,Write-In Totals,50

This might be because the name MOHR (a candidate's name) is being parsed as a precinct name?
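Splits like this can be detected without knowing their cause: for a statewide office such as President, every precinct in a county should list the same set of candidates. The sketch below is one possible check under that assumption; `precincts_missing_candidates` is a hypothetical helper, and the precinct `EXAMPLE` and all vote counts in the sample are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def precincts_missing_candidates(csv_text, office="President"):
    """Map each precinct to the candidates it lacks for the given office."""
    by_precinct = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["office"] == office:
            by_precinct[row["precinct"]].add(row["candidate"])
    # Every candidate seen anywhere in the file for this office.
    all_candidates = set().union(*by_precinct.values())
    return {
        precinct: sorted(all_candidates - found)
        for precinct, found in by_precinct.items()
        if found != all_candidates
    }


# Illustrative rows only; EXAMPLE is a made-up complete precinct.
sample = (
    "county,precinct,office,district,party,candidate,votes\n"
    "Washington,SC71,President,,DEM,JOSEPH R. BIDEN,1\n"
    "Washington,MOHR,President,,REP,DONALD J. TRUMP,1\n"
    "Washington,EXAMPLE,President,,DEM,JOSEPH R. BIDEN,1\n"
    "Washington,EXAMPLE,President,,REP,DONALD J. TRUMP,1\n"
)
print(precincts_missing_candidates(sample))
```

This only works for offices that appear on every ballot in the county; district races would need to be grouped by district first.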

dwillis commented 3 years ago

@mileswwatkins thanks for this - we'll have this fixed today.

andysylvester commented 3 years ago

I am close to finishing Beaver County. Their report lists normal and provisional ballots for each precinct. I have been adding the two together; should I keep them separate (have separate rows)?

dwillis commented 3 years ago

@andysylvester hey Andy, ideally you can add them together for the votes column and make separate columns for each of them, but whatever is easiest.
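As a rough illustration of that layout, the sketch below builds one row with the combined total in `votes` plus a column for each ballot type. The helper name `combined_row` and the column name `normal` are assumptions made for this example; `provisional` matches the county-file header quoted earlier in the thread, and the sample values are made up.

```python
def combined_row(county, precinct, office, district, party, candidate,
                 normal, provisional):
    """Build one result row with a total and a column per ballot type."""
    return {
        "county": county,
        "precinct": precinct,
        "office": office,
        "district": district,
        "party": party,
        "candidate": candidate,
        # votes holds the combined total of both ballot types.
        "votes": normal + provisional,
        "normal": normal,  # hypothetical column name
        "provisional": provisional,
    }


# Illustrative values only, not real Beaver County results.
row = combined_row("Beaver", "Precinct 1", "President", "", "REP",
                   "DONALD J. TRUMP", 100, 3)
print(row["votes"], row["normal"], row["provisional"])
```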

andysylvester commented 3 years ago

@dwillis I will go ahead and keep them together in the votes column, will plan to make a commit later today.

andysylvester commented 3 years ago

I was ready to make a commit just now for Beaver County, but saw that @dwillis made a commit two days ago. I am sorry if taking a week to create a file is too long, but this is the second time someone has completed a county I signed up for. This does not motivate me to keep signing up to help on this project. Can someone help me understand what is going on here?

dwillis commented 3 years ago

@andysylvester Hey Andy, first let me say thank you for your work - it is needed. This isn't on you; it's on me. When we have new folks come into the project, often we will have more experienced volunteers duplicate their work to help us check things (and clearly, from the comments here, that's needed). In this case we had another Beaver County done and I was eager to get a statewide results file out, so I apologize for making you feel like your work isn't valuable and valued. I hope you do stick around, and if this should happen again I will let you know asap, but mostly will try to avoid this.

andysylvester commented 3 years ago

Thanks for the explanation, I agree from the comment thread that data needs to be double-checked. Should I go ahead with my commit (I included data for what appeared to be write-in candidates), or let it go? Also, where should I go to sign up for another county, and to make sure people know I am signed up?

dwillis commented 3 years ago

@andysylvester sure - why don't you send it directly to me at openelections@gmail.com and I'll check it against what we have. In terms of another county, Utah is now finished for the general but California is wide open. https://github.com/openelections/openelections-data-ca/issues/144. I'll make sure the decks are cleared for you.