openelections / openelections-data-ca

Pre-processed election results for California elections
MIT License

Convert 2020 General Election Precinct Results #144

Closed dwillis closed 2 years ago

dwillis commented 3 years ago

If you want to work on a county, add a comment saying which one you'd like to work on or email openelections@gmail.com. You can either email us finished CSV files or submit a pull request, whatever is easiest.

The results files you'll be converting are in the California sources repository. Many of these are either PDFs or Excel files. For electronic PDFs, we recommend using Tabula, which is free, to extract data. The goal is to create a single CSV file for each county, with the following headers:

county, precinct, office, district, party, candidate, votes

If the county file also provides a breakdown of votes by method, include those using the following headers:

early_voting, election_day, provisional, mail
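As a rough illustration of the two header sets above, here is a minimal Python sketch that writes rows with the base columns and adds the vote-method columns only when a county reports them. The function names and the sample row in the usage note are hypothetical and made up for illustration; they are not part of the project's tooling.

```python
import csv
import io

# Column names taken from the instructions above.
BASE_HEADERS = ["county", "precinct", "office", "district", "party", "candidate", "votes"]
METHOD_HEADERS = ["early_voting", "election_day", "provisional", "mail"]


def build_headers(has_method_breakdown):
    """Return the header row, adding vote-method columns only when reported."""
    if has_method_breakdown:
        return BASE_HEADERS + METHOD_HEADERS
    return list(BASE_HEADERS)


def rows_to_csv(rows, has_method_breakdown=False):
    """Serialize a list of row dicts to CSV text using the headers above."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=build_headers(has_method_breakdown),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()
```

For example, `rows_to_csv([{"county": "Alpine", "precinct": "1", "office": "President", "district": "", "party": "DEM", "candidate": "Example Candidate", "votes": 10}])` produces a header line followed by one data line; the values here are placeholders, not real results.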

Include results for the following offices: President, U.S. House, and any State Senate or State Assembly races. File names should be: 20201103__ca__general__{county_name}__precinct.csv, with the county name in lower case and spaces replaced with underscores.
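The file naming rule can be sketched in a few lines of Python. The helper name `precinct_filename` is hypothetical; only the pattern itself comes from the instructions above.

```python
def precinct_filename(county_name):
    """Build the precinct file name: county lower-cased, spaces replaced with underscores."""
    slug = "_".join(county_name.strip().lower().split())
    return f"20201103__ca__general__{slug}__precinct.csv"


print(precinct_filename("San Luis Obispo"))
# 20201103__ca__general__san_luis_obispo__precinct.csv
```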

An example file looks like this

aidanconnolly commented 3 years ago

I'll start on Alameda.

andysylvester commented 3 years ago

I will take Alpine County.

goodasha commented 3 years ago

Added Lake general #147.

goodasha commented 3 years ago

I will take Mendocino.

aidanconnolly commented 3 years ago

I've got Placer.

carbonphyber commented 3 years ago

I took a stab at Santa Clara (#148). If any other counties use ClarityElections.com to store or display elections data, please have a look at my PR.

goodasha commented 3 years ago

I will take Tulare.

omarcoming commented 3 years ago

I will take Colusa.

Edit: I will also do Contra Costa and Kern counties, as their reporting format is the same as Colusa's.

omarcoming commented 3 years ago

How should we deal with precincts where the Elections Dept has withheld vote tallies because the precinct is so small, or turnout so low, that reporting them would compromise voter privacy?

For example, Contra Costa County has several precincts where the In-Person tallies are censored, but we could easily infer the proper counts by subtracting the Vote By Mail portion from the Total. Should I include that inferred data?

The other issue I'm running into is with precincts that haven't reported either the Vote By Mail or In-Person tallies for any candidates and only include the totals. What is the standard for representing missing data on this project?

dwillis commented 3 years ago

@omarcoming good questions, thank you! We don't infer data, so try to represent things as they are reported. In practice, you can put N/A in the appropriate column for tallies that are withheld. For precincts that only have the totals, just list the total.
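One way to follow this rule in a conversion script is sketched below. It assumes, purely for illustration, that the parsing step stores a withheld tally as `None`; the helper names are hypothetical. Nothing is computed by subtraction, so withheld values stay withheld.

```python
WITHHELD = "N/A"  # placeholder suggested above for tallies the county withheld


def format_tally(value):
    """Return the tally as reported; None (assumed to mean 'withheld') becomes N/A."""
    return WITHHELD if value is None else str(value)


def build_row(base, tallies):
    """Combine base columns with whatever method tallies the county reported.

    Pass an empty dict for precincts that report totals only, so the row
    carries just the total in `votes` and no inferred breakdown.
    """
    row = dict(base)
    for column, value in tallies.items():
        row[column] = format_tally(value)
    return row
```

For instance, `build_row({"votes": 12}, {"mail": 9, "election_day": None})` keeps the reported total and mail count and writes `N/A` for the withheld election-day tally.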

omarcoming commented 3 years ago

> If the county file also provides a breakdown of votes by method, include those using the following headers:
>
> early_voting, election_day, provisional, mail

What if the breakdown is just in-person vs mail?

dwillis commented 3 years ago

@omarcoming then just use mail and election_day, unless the in-person combines election day and early voting. In that case, do mail and in_person and we'll figure it out.
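The choice described here could be captured as a small helper. This is only a sketch; the function name and its boolean parameter are hypothetical.

```python
def method_columns(in_person_includes_early_voting):
    """Pick column names for a county that reports only in-person vs. mail tallies."""
    if in_person_includes_early_voting:
        # In-person combines election day and early voting: keep it as in_person.
        return ["mail", "in_person"]
    # Otherwise the in-person tally is treated as election-day voting.
    return ["mail", "election_day"]
```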

goodasha commented 3 years ago

I'll take Imperial.

carbonphyber commented 3 years ago

I'll work on Madera.

dwillis commented 3 years ago

Hi @carbonphyber - I should have marked it, but someone else claimed Madera this week. Any chance you would entertain another?

carbonphyber commented 3 years ago

> Hi @carbonphyber - I should have marked it, but someone else claimed Madera this week. Any chance you would entertain another?

I've already submitted a PR for Madera: #186.

I'll work on San Luis Obispo next.

carbonphyber commented 3 years ago

I'll work on Santa Barbara.

carbonphyber commented 3 years ago

I'll work on Los Angeles.

goodasha commented 3 years ago

> I'll take Imperial

This file has Absentee in addition to early voting, vote by mail, and election day. How should absentee be handled?

dwillis commented 3 years ago

> I'll take Imperial
>
> This file has Absentee in addition to early voting, vote by mail and election day. How is absentee handled?

Just add another column called absentee.

goodasha commented 3 years ago

I can take Tehama.

goodasha commented 3 years ago

I can give Contra Costa a try

dwillis commented 3 years ago

@goodasha that would be fabulous, thank you!

goodasha commented 3 years ago

I can take San Bernardino

dwillis commented 3 years ago

@goodasha thank you!

goodasha commented 3 years ago

> @goodasha thank you!

Question: This file has Designated Mail Ballot in addition to Mail Ballot as a type of vote. How should that be labeled? And are Provisional votes labeled provisional in the final file?

dwillis commented 3 years ago

You can label it as designated_mail, and please do include provisional votes with the label 'provisional'.

goodasha commented 3 years ago

I can take Kern

goodasha commented 3 years ago

Feeling a little rusty on this. Does 'Mail Ballot' translate to 'mail', and does 'Vote by Mail' map to 'absentee'?

dwillis commented 3 years ago

@goodasha Mail Ballot would be early_voting in this case, and Vote by Mail would be just mail. Thanks!
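Since the right column depends on how each county labels its vote types, a per-county lookup table is one way to keep these decisions explicit. The sketch below encodes only the mapping given in this reply; the names `LABEL_TO_COLUMN` and `column_for` are hypothetical, and other counties would need their own tables.

```python
# Mapping for the file discussed in this reply only; not a statewide rule.
LABEL_TO_COLUMN = {
    "Mail Ballot": "early_voting",
    "Vote by Mail": "mail",
}


def column_for(label):
    """Translate a source vote-type label to its agreed output column.

    Unknown labels raise an error instead of being guessed, so they can be
    raised in the issue thread first.
    """
    try:
        return LABEL_TO_COLUMN[label.strip()]
    except KeyError:
        raise ValueError(f"No agreed column for source label {label!r}") from None
```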