openelections / specs

Specs for OpenElections Data

Current specs for Results etc? #1

Open nealmcb opened 7 years ago

nealmcb commented 7 years ago

I'm writing software to process OpenElections data, e.g. to determine winners, calculate margins, and run other analyses and audits on data in the same format.

I'm finding it hard to find the openelections specs for results format.

I found an old blog post that points to "Elections Data Spec Version 2" in the openelections/specs wiki. I looked for the latest version in the openelections/specs repo with no luck until I noticed that the links went to wiki entries. I then looked more closely at the wiki, which didn't list any more recent versions.

But the data I found in, e.g., the Florida 2016 general precinct results has different fields.

What are the current specs? Do all the results files use the same spec?

Thanks!

dwillis commented 7 years ago

Hi @nealmcb, thanks for the questions. The specs in the wiki cover two types of information: Elections and Results. The Results spec, whose version 2 is in the wiki, is what we're aiming for in the openelections-results-{state} repositories like the Florida one you referenced. The Election specs are for metadata about elections that we track.

There's one caveat to this: the files currently in the results repositories are considered "raw" in the sense that they are produced using the --raw option to our bake command. Raw results follow a standard schema, but their values are unprocessed, taken straight from the raw data files.

Does that help? It probably creates more questions, I realize.

nealmcb commented 7 years ago

Thanks for the quick response! And for the reminder about what "raw" means.

I was looking at the Results 2 spec, and there seem to be a number of differences from the Florida file:


```
FL file              Results v 2      type     description
updated_at
id                   election_id      string   OpenElections-created slug for election
start_date
end_date
election_type
result_type
special
office               office_name      string   name of office sought
district             office_district  string   district number or designation
name_raw             name             string   "raw" full name of the candidate from the results, if present
last_name            family_name      string   parsed family name of the candidate
first_name           given_name       string   parsed given name of the candidate
suffix               suffix           string   parsed suffix of the candidate
middle_name          additional_name  string   parsed middle name or initial of the candidate
                     other_names      array    array of additional names with note label
party                party            string   "raw" party name or abbreviation from the results (None for non-partisan)
parent_jurisdiction
jurisdiction
division             division         string   political jurisdiction using Open Civic Data Division Identifiers - acts as reporting level
votes                votes            integer  number of votes received by the candidate within this division
votes_type
total_votes
winner               winner           boolean  true for the winning candidate(s) within this division; all other candidates are marked as false
write_in             write_in         boolean  true if the candidate is a write-in candidate
year
                     pct              decimal  percentage of votes received by the candidate within this division
```

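
The correspondences in the comparison can also be written down as a small lookup, which may help anyone converting raw rows toward the v2 names. This is a minimal Python sketch, not an OpenElections tool: the pairs are copied from the comparison, while `RAW_TO_V2`, `to_v2_names`, and the sample row are hypothetical names and data introduced here. Fields that exist only in the raw file (for example `parent_jurisdiction`) pass through unchanged, and no values are parsed or converted.

```python
# Hypothetical helper: rename raw-file columns to the Results v2 names
# paired with them in the comparison. Not part of any OpenElections code.
RAW_TO_V2 = {
    "id": "election_id",
    "office": "office_name",
    "district": "office_district",
    "name_raw": "name",
    "last_name": "family_name",
    "first_name": "given_name",
    "middle_name": "additional_name",
}


def to_v2_names(row):
    """Return a copy of `row` with raw column names replaced by v2 names.

    Columns with no v2 counterpart in the comparison keep their raw name.
    """
    return {RAW_TO_V2.get(key, key): value for key, value in row.items()}


# Made-up row for illustration only.
raw_row = {"office": "President", "district": "", "name_raw": "Jane Doe", "votes": "120"}
print(to_v2_names(raw_row))
```

Fields that appear only in the spec (`other_names`, `pct`) are deliberately left out, since the raw file has nothing to derive them from by renaming alone.
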
dwillis commented 7 years ago

Very true. We haven't had much discussion about the specs in a while, and it seems to me that the spec and the data have gotten out of sync. I'll get an update to the spec on the list. Of the fields in the raw results that aren't in the spec, the most consequential are parent_jurisdiction, which is usually a state or a county/city depending on the context, and jurisdiction, which is a county or a precinct, again depending on the geographic context. votes_type and total_votes are there in case the data breaks votes out by type (say, election day votes vs. early voting).
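
For the winner and margin calculations mentioned at the top of the thread, rows reported per jurisdiction have to be rolled up per contest first. The following is a rough Python sketch under stated assumptions, not OpenElections code: `contest_summary` and the sample rows are hypothetical, the column names are the raw ones from the Florida file, `votes` is assumed to be a clean integer string (raw values may need cleaning first), and each candidate's votes are assumed to appear once per jurisdiction, so a file that carries both per-`votes_type` rows and a combined row would be double counted.

```python
from collections import defaultdict


def contest_summary(rows):
    """Sum votes per candidate for each (office, district) contest.

    Returns {contest: (winner, margin)}, where margin is the lead over the
    runner-up (or the winner's full total if there is only one candidate).
    Ties are not resolved: on a tie the margin is 0 and the reported
    winner is whichever tied candidate sorts first by name.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        contest = (row["office"], row["district"])
        # Assumption: votes is a plain integer string such as "120".
        totals[contest][row["name_raw"]] += int(row["votes"])

    summary = {}
    for contest, by_candidate in totals.items():
        # Highest vote total first; name breaks ties deterministically.
        ranked = sorted(by_candidate.items(), key=lambda item: (-item[1], item[0]))
        winner, top = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        summary[contest] = (winner, top - runner_up)
    return summary


# Made-up rows for illustration only; real files have more columns.
rows = [
    {"office": "Governor", "district": "", "jurisdiction": "Precinct 1", "name_raw": "A", "votes": "10"},
    {"office": "Governor", "district": "", "jurisdiction": "Precinct 1", "name_raw": "B", "votes": "7"},
    {"office": "Governor", "district": "", "jurisdiction": "Precinct 2", "name_raw": "A", "votes": "4"},
    {"office": "Governor", "district": "", "jurisdiction": "Precinct 2", "name_raw": "B", "votes": "6"},
]
print(contest_summary(rows))
```

Grouping on `office` and `district` alone only makes sense within a single election file; combining files would also need the election `id` in the key.
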

nealmcb commented 7 years ago

Thanks. I suggest putting the resulting specs in the repo, vs in the wiki.

dwillis commented 7 years ago

Good idea.