openelections / clarify

Discover and parse results for jurisdictions that use Clarity-based election systems.
MIT License

Add CLI interface #18

Open ghing opened 7 years ago

ghing commented 7 years ago

The original use case deliberately left out downloading and unpacking the XML results. Still, I think volunteers would find it much easier to deal with counties or states that use Clarity systems if they could run a single command to download the results as CSV, then do any post-processing with simpler scripts or by updating the data manually.

@chagan and I started working on this at the #NICAR17 hackathon.

Tasks

ghing commented 7 years ago

@dwillis, I made this ticket to have a place to track the work that Chris and I have done so far and to reference once I send a pull request.

Should the output columns match, as much as possible, the ones listed in Common Fields? Or the columns listed in the data entry instructions?

Another option that Chris and I discussed would be to have the output match that of elex, so users of that tool could easily pull results from Clarity into a system designed around using elex to pull in AP results. However, our thinking at the time was that this could be a later feature, added if someone requests it, and that we should instead prioritize whatever's easiest for Open Elections.

dwillis commented 7 years ago

@ghing my thoughts here are that the output should probably be closer to the columns in the data entry instructions, with the caveat that if there are additional columns they should be provided as well. I like the elex integration idea, but agree that it's a future thing.

ghing commented 7 years ago

Note to self: Originally @chagan and I wanted to keep this as simple as possible and envisioned having one command to rule them all that would download all results for all jurisdictions. However, this is both slow and difficult to integration test (I'm finding that only one jurisdiction has issues while the others are fine). I also wonder whether grabbing everything is the most common use case. On election night, folks probably just want racewide results, and for people doing further analysis, running a command for each county, or scripting that separately, isn't too onerous. I'm going to try making separate jurisdictions and results subcommands. Having subcommands seems worth it anyway to make things more extensible.
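
For illustration, here is a minimal sketch of how separate jurisdictions and results subcommands could be wired up with argparse from the standard library. The `build_parser` name, the `results_url` argument and the help strings are assumptions made for this example; they are not taken from the actual branch.

```python
import argparse


def build_parser():
    """Build a parser with separate 'jurisdictions' and 'results' subcommands."""
    parser = argparse.ArgumentParser(prog="clarify")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # fail early when no subcommand is given

    # Hypothetical subcommand: list the jurisdictions behind a results URL.
    jurisdictions = subparsers.add_parser(
        "jurisdictions", help="List jurisdictions for a results URL")
    jurisdictions.add_argument("results_url")

    # Hypothetical subcommand: output results for one jurisdiction as CSV.
    results = subparsers.add_parser(
        "results", help="Output results for a results URL as CSV")
    results.add_argument("results_url")

    return parser


# Parse an example command line instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["results", "http://example.com/summary.html"])
```

Each subcommand gets its own argument set, so new ones can be added later without touching the existing ones.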

ghing commented 7 years ago

Just an update on this. I wrote almost all of the code to do this at the beginning of the month, but in testing for the use case, I ran into some errors for some of the counties. I'm slammed with other projects, at least for this week, but I'm hoping to circle back around to this at the end of the month or beginning of May.

ghing commented 7 years ago

A little background on the empty results:

clarify results http://results.enr.clarityelections.com/AR/Yell/63988/184121/Web01/en/summary.html

produces these results:

state,jurisdiction,office,candidate,party,votes
,,U.S. President & Vice President,,,
,Bluffton,U.S. President & Vice President,,,
,Briggsville,U.S. President & Vice President,,,
,Centerville,U.S. President & Vice President,,,
,Compton,U.S. President & Vice President,,,
,Crawford,U.S. President & Vice President,,,
,Danville,U.S. President & Vice President,,,
,Dardanelle Outside,U.S. President & Vice President,,,
,Dardanelle Ward 1,U.S. President & Vice President,,,
,Dardanelle Ward 2,U.S. President & Vice President,,,
,Dardanelle Ward 3 Other,U.S. President & Vice President,,,
,Dardanelle Ward 3 JP 10,U.S. President & Vice President,,,
,Dutch Creek,U.S. President & Vice President,,,
,Ferguson,U.S. President & Vice President,,,
,Galla Rock,U.S. President & Vice President,,,
,Gilkey,U.S. President & Vice President,,,
,Gravelly,U.S. President & Vice President,,,
,Herring,U.S. President & Vice President,,,
,Lamar,U.S. President & Vice President,,,
,Magazine 1,U.S. President & Vice President,,,
,Magazine 2,U.S. President & Vice President,,,
,Mason,U.S. President & Vice President,,,
,Mountain,U.S. President & Vice President,,,
,Prairie,U.S. President & Vice President,,,
,Richland,U.S. President & Vice President,,,
,Waveland,U.S. President & Vice President,,,
,Riley,U.S. President & Vice President,,,
,Rover,U.S. President & Vice President,,,
,Ward,U.S. President & Vice President,,,
,Ions Creek,U.S. President & Vice President,,,

I need to figure out why there are result objects that don't have any votes.

ghing commented 7 years ago

I added some logging to the Parser class and realized that my code in clarify.cli.results.result_as_dict was pulling the votes from Choice.total_votes, which is the total votes for the candidate (choice) for a particular type of vote, not the per-precinct votes.

Here's the code that doesn't work:

def result_as_dict(result, **addl_cols):
    """Return a result as a dictionary suitable for serialization"""
    result_dict = dict(**addl_cols)
    result_dict['office'] = result.contest.text
    # Cols:
    #  county, precinct, office, district, party, candidate, votes, winner (if
    #  it's in the data).

    if result.jurisdiction is not None:
        result_dict['jurisdiction'] = result.jurisdiction.name

    if result.choice is not None:
        result_dict['candidate'] = result.choice.text
        result_dict['party'] = result.choice.party
        result_dict['votes'] = result.choice.total_votes

    return result_dict

I'm going to get started on reworking this.
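
A minimal sketch of what the rework might look like, assuming each result object carries its own vote count in a `votes` attribute. The namedtuple stand-ins below exist only so the example runs on its own; they are not the parser's real classes, and the candidate, party and vote numbers are made up.

```python
from collections import namedtuple

# Stand-ins for the parser's objects, defined here only for illustration.
Contest = namedtuple("Contest", ["text"])
Jurisdiction = namedtuple("Jurisdiction", ["name"])
Choice = namedtuple("Choice", ["text", "party", "total_votes"])
Result = namedtuple("Result", ["contest", "jurisdiction", "choice", "votes"])


def result_as_dict(result, **addl_cols):
    """Return a result as a dictionary suitable for serialization"""
    result_dict = dict(**addl_cols)
    result_dict['office'] = result.contest.text

    if result.jurisdiction is not None:
        result_dict['jurisdiction'] = result.jurisdiction.name

    if result.choice is not None:
        result_dict['candidate'] = result.choice.text
        result_dict['party'] = result.choice.party

    # Assumption: the result holds the votes for this jurisdiction, while
    # choice.total_votes is the choice-wide total and is not used here.
    result_dict['votes'] = result.votes

    return result_dict


# Made-up example values: 12 votes in one precinct, 500 for the choice overall.
row = result_as_dict(
    Result(Contest("U.S. President & Vice President"),
           Jurisdiction("Bluffton"),
           Choice("Candidate A", "IND", 500),
           12),
    state="AR")
```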

ghing commented 7 years ago

While figuring out the problem I reported in this comment, I realized that there are a bunch of fields in the XML that we could surface in the CSV.

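Jurisdiction:

- total voters
- ballots cast
- voter turnout
- percent reporting
- precincts participating
- precincts reported
- precincts reporting percent

Contest:

- number of candidates you can vote for
- is the contest a ballot question
- percent reporting
- precincts participating
- precincts reported
- number of counties reporting

We should figure out which of these should be included in the output CSV and how they should be presented.

My intuition is that we should:

- Include everything
- Repeat values for each result that's for a given contest, jurisdiction
- Leave values recorded at the county level (e.g. precincts reporting) blank for precinct results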

@dwillis, do you have thoughts or questions?
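
As a rough sketch of that presentation, the snippet below repeats contest-level values on every row and leaves county-level values blank on precinct rows, using csv.DictWriter with `restval`. The column names (`vote_for`, `precincts_reported`) and all the row values are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical column names, chosen only for this example.
FIELDNAMES = ["jurisdiction", "office", "candidate", "votes",
              "vote_for", "precincts_reported"]


def write_rows(rows, out):
    """Write result rows as CSV, leaving any missing field blank."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Contest-level values are repeated on every row for that contest.
contest_cols = {"office": "Example Office", "vote_for": 1}
rows = [
    # County-level row: the county-level field is populated.
    dict(contest_cols, jurisdiction="Example County",
         candidate="Candidate A", votes=30, precincts_reported=2),
    # Precinct-level rows: the county-level field is omitted, so it is blank.
    dict(contest_cols, jurisdiction="Precinct 1",
         candidate="Candidate A", votes=10),
    dict(contest_cols, jurisdiction="Precinct 2",
         candidate="Candidate A", votes=20),
]

buf = io.StringIO()
write_rows(rows, buf)
csv_text = buf.getvalue()
```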

dwillis commented 7 years ago

Let's include everything we can.

On Jul 2, 2017, at 12:22 PM, Geoffrey Hing notifications@github.com wrote:

While figuring out the problem I reported in this comment, I realized that there are a bunch of fields in the XML that we could surface in the CSV.

Jurisdiction:

- total voters
- ballots cast
- voter turnout
- percent reporting
- precincts participating
- precincts reported
- precincts reporting percent

Contest:

- number of candidates you can vote for
- is the contest a ballot question
- percent reporting
- precincts participating
- precincts reported
- number of counties reporting

We should figure out which of these should be included in the output CSV and how they should be presented.

My intuition is that we should:

- Include everything
- Repeat values for each result that's for a given contest, jurisdiction
- Leave values recorded at the county level (e.g. precincts reporting) blank for precinct results

@dwillis, do you have thoughts or questions?


ghing commented 6 years ago

@GPHemsley I am so excited to see you giving some love to this project. I have some commits tucked away from when I was working on adding a CLI to this project. I wanted to check in and see if this was something you were also poking at. I'll try to merge in your improvements and just take stock of where I was at with adding a CLI.

GPHemsley commented 6 years ago

@ghing I was working my way towards this ticket, per @dwillis, but I don't have to if you have a WIP.

ghing commented 6 years ago

Let me merge your changes into my fork and see how far I got. I'm happy to hand this over to you, but just wanted to give you a heads up if it's a useful starting point.

On Thu, Sep 6, 2018, 8:49 PM Gordon P. Hemsley notifications@github.com wrote:

@ghing I was working my way towards this ticket, per @dwillis, but I don't have to if you have a WIP.


ghing commented 6 years ago

@GPHemsley I merged master into my feature branch over at https://github.com/ghing/clarify/tree/add-cli. Feel free to use that as a starting point for further work on this feature, and let me know if you have any questions. I'm happy to pass this off to you, but didn't want you to replicate effort.

nealmcb commented 3 years ago

Thanks for all your work folks. I'm wondering what the status of this CLI work is. It sounds like some things work at least?

For context, years ago I did my own interface for downloading Clarity ENR and saving snapshots in a database: https://github.com/nealmcb/snapshot-election-results

I just started porting it to Python 3 in a new branch, then decided to look at how this project was progressing.