BrightSpots / rcv

Ranked Choice Voting Universal Tabulator
Mozilla Public License 2.0

WIP: Split `Inactive Ballots by Exhausted Choices` bucket into `Inactive Fully Ranked Ballots by Exhausted Choices` and `Inactive Partially Ranked Ballots by Exhausted Choices` #854

Open yezr opened 1 week ago

yezr commented 1 week ago

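Currently we have a row for Inactive Ballots by Exhausted Choices in the summary.csv. We'd like to split this into two buckets: Inactive Fully Ranked Ballots by Exhausted Choices and Inactive Partially Ranked Ballots by Exhausted Choices.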

We will use the number configured in Maximum Number of Candidates That Can Be Ranked to differentiate between the two. This setting will be required! We don't want to assume the maximum number of rankings; it must be explicitly configured.
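To make the proposed split concrete, here is a minimal sketch in Python (the tabulator itself is not written in Python). It assumes that "fully ranked" means the ballot used every ranking position allowed by the configured maximum, which is only one possible reading of this proposal. The function name, the bucket labels, and the use of `None` for a skipped ranking are all hypothetical.

```python
from typing import Optional, Sequence


def classify_exhausted_ballot(
    rankings: Sequence[Optional[str]],
    max_rankings: Optional[int],
) -> str:
    """Pick a bucket for a ballot that went inactive by exhausted choices.

    rankings: one entry per rank position; None marks a skipped position
              (hypothetical representation, not the tabulator's data model).
    max_rankings: the explicitly configured maximum number of rankings.
    """
    # The proposal requires an explicit number, so refuse to guess.
    if max_rankings is None:
        raise ValueError("max_rankings must be explicitly configured")
    if max_rankings < 1:
        raise ValueError("max_rankings must be at least 1")

    # Assumption: a ballot is fully ranked when it used every allowed position.
    used = sum(1 for choice in rankings if choice is not None)
    if used >= max_rankings:
        return "fully_ranked"
    return "partially_ranked"
```

With a configured maximum of 3, a ballot ranking A, B, C would land in the fully ranked bucket, while a ballot that skipped its second ranking would land in the partially ranked bucket.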

Image

Open Questions

artoonie commented 1 week ago

What if the config sets the max rankings to "MAX", but one of the CVRs has a stricter limit? This can happen with Dominion CVRs. It feels like it may be more legible, both in code and in the output files, to simply exclude any "inactive by __" row if no ballots are inactive in that way?

yezr commented 1 week ago

That's right, with multi-vendor CVRs the max ranking could be different for each one... hmm. Made this a WIP and added all the open questions to the original description.

yezr commented 1 week ago

We need some more time to process the open questions. Removing from 2.0

yezr commented 1 week ago

As a stopgap until this is implemented, would it be possible to record in the audit.log how many total ranks a ballot had when we log that it became exhausted?

artoonie commented 6 days ago

As I was reverting the code to defer this, I found a solution I'm somewhat happy with that I'd like to pitch.

The idea is that we always include as much information as we have, and only exclude the line item in the CSV when two conditions are met:

  1. The config.maxRanking is set to Max, AND
  2. There are either no Dominion files, OR the Dominion-specific Max Ranking is never hit

Concretely, the following table indicates when Inactive Ballots by Exhausted Choices (Fully Ranked) will be included in the CSV:

| CVR Types | config.maxRanking = Max | config.maxRanking = a number |
| --- | --- | --- |
| Dominion only | Include if any ballots match | Include |
| Dominion and others | Include if any ballots match* | Include |
| No Dominion | Exclude | Include |

The asterisk denotes the potentially strange case: a multi-vendor tabulation where only Dominion ballots can end up in the fully ranked count. I think this is okay, because it seems reasonable to encourage, but not require, the operator to set the config max ranking to a value other than Max.
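The inclusion rule in the table can be sketched as a small predicate. This is an illustrative Python sketch, not the tabulator's actual code. It assumes `config.maxRanking = Max` is represented as `None`, and that "any ballots match" means at least one Dominion ballot hit its Dominion-specific max ranking. The function and parameter names are hypothetical.

```python
from typing import Optional


def include_fully_ranked_row(
    config_max_ranking: Optional[int],
    has_dominion_files: bool,
    dominion_max_ranking_hit: bool,
) -> bool:
    """Decide whether the fully ranked row appears in the summary CSV.

    config_max_ranking: None stands in for "Max" (assumption); an int is an
                        explicitly configured number.
    has_dominion_files: True if any CVR source is a Dominion file.
    dominion_max_ranking_hit: True if any ballot reached the Dominion-specific
                              max ranking.
    """
    # An explicit number always gives enough information to report the row.
    if config_max_ranking is not None:
        return True
    # With "Max", only Dominion's own limit can produce fully ranked ballots.
    return has_dominion_files and dominion_max_ranking_hit
```

The asterisked case corresponds to `include_fully_ranked_row(None, True, True)` in a multi-vendor tabulation: the row is included, but only Dominion ballots can contribute to it.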

I know this discussion could go on for much longer, but wanted to get a gut check on whether this interim solution is acceptable.

artoonie commented 5 days ago

Okay, here's a funny thing that will need to be understood if we do any sort of splitting of fully ranked vs partially ranked:

If every ranking on a CVR is for an undeclared candidate, that CVR should probably be marked as fully ranked. But that means you can have inactive CVRs in the second round, which looks wrong but isn't.