votingworks / electionguard-kotlin-multiplatform

An implementation of ElectionGuard version 2.0.0 in Kotlin.
MIT License

Implementing write-in counting #167

Open JohnLCaron opened 2 years ago

JohnLCaron commented 2 years ago

"Write-ins are assumed to be explicitly registered or allowed to be lumped into a single “write-ins” category for the purpose of verifiable tallying. Verifiable tallying of free-form write-ins may be best done with a MixNet design." (p 15, spec 1.51)

Currently we have no explicit processing of write-ins.

Note that the spec does not currently describe the input PlaintextBallot. So there's no explicit specification of what a ballot marking device or scanner should do with write-ins.

A PlaintextBallot Selection has a selectionId, which is supposed to match a Manifest selectionId. But it won't unless all write-in Candidates are registered and added as selections on the Manifest Contest. Call this the "explicitly registered" Option 1. In this case, there's nothing extra to do, if we assume that the scanner correctly identifies the write-in and creates the correct PlaintextBallot selection. The only difference between a regular candidate and a write-in candidate is that the write-in candidate doesn't appear on the ballot and must be (correctly) written in. There's no need to add anything to the ContestData (except for overvotes, TBD).

Another possibility is that the scanner adds any write-ins to the PlaintextBallot. Currently, we have an "extendedData" string on the Selection (note: it should be renamed to "writeIn" and placed on the Contest, not the Selection).

Option 2 is when write-ins are not pre-registered. The scanner puts any write-ins into the PlaintextBallot.Contest, and we add them to the EncryptedBallot Contest's ContestData field. To see whether a contest has a write-in, we have to decrypt its ContestData.

Option 3 adds a "lumped write-in" selection to every contest. It does not have a matching Manifest selection. It records whether a valid write-in was voted for. These are encoded in the normal way, count against the contest limit, etc. Then one can find out if there are enough write-ins to affect the election. If so, the ContestData is decoded and the actual write-ins are tallied. In this case, we are back to adding selections to the contest, increasing the computational burden. If Contest.votes_allowed > 1, we need multiple lumped write-in selections, unless we can use range_proofs and allow Selection.vote > 1.

So it seems to me that there are three cases:

  1. "Explicitly Registered": write-ins have selections in the Manifest and need no special processing by us.

  2. "Free-form Write-Ins": write-ins are encoded in the ContestData, which must be decoded for them to be counted.

  3. "Lumped Write-Ins": in addition to the ContestData, every contest has a "lumped write-in" selection which records whether there are valid write-in vote(s). It is tallied as usual, without needing to decode the ContestData at the same time.
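
To make the difference between the three cases concrete, here is a minimal Kotlin sketch of what each case would hand to the encryption step. All names in it (`WriteInMode`, `ContestInput`, `EncodingPlan`, `planEncoding`, `LUMPED_WRITE_IN_ID`) are hypothetical and are not part of the library; it only models which selections get tallied and which write-in strings go into ContestData.

```kotlin
// Hypothetical model of the three write-in cases; not the library's actual API.
enum class WriteInMode { EXPLICITLY_REGISTERED, FREE_FORM, LUMPED }

data class ContestInput(
    val contestId: String,
    val selectedIds: List<String>, // selections that match Manifest selectionIds
    val writeIns: List<String>,    // free-text write-ins reported by the scanner
)

data class EncodingPlan(
    val selectionIds: List<String>,        // selections to encrypt and tally as usual
    val contestDataWriteIns: List<String>, // write-in strings to store in ContestData
)

// Assumed id for the extra selection of case 3; it has no Manifest counterpart.
const val LUMPED_WRITE_IN_ID = "lumped-write-in"

fun planEncoding(mode: WriteInMode, input: ContestInput): EncodingPlan = when (mode) {
    // Case 1: the scanner has already mapped write-ins to registered selections,
    // so any free text is ignored and nothing is added to ContestData.
    WriteInMode.EXPLICITLY_REGISTERED ->
        EncodingPlan(input.selectedIds, emptyList())
    // Case 2: write-ins live only in ContestData and must be decrypted to be counted.
    WriteInMode.FREE_FORM ->
        EncodingPlan(input.selectedIds, input.writeIns)
    // Case 3: as case 2, plus one tallied selection recording that a write-in exists.
    WriteInMode.LUMPED -> {
        val extra = if (input.writeIns.isNotEmpty()) listOf(LUMPED_WRITE_IN_ID) else emptyList()
        EncodingPlan(input.selectedIds + extra, input.writeIns)
    }
}
```

The sketch uses a single lumped selection per contest; a contest with votes_allowed > 1 would need several, or a selection that can hold a vote greater than 1.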

JohnLCaron commented 2 years ago

thoughts, @danwallach ??

JohnLCaron commented 2 years ago

I've implemented a first pass in PR#170. Will leave this open for further discussion and refinement.

danwallach commented 2 years ago

Case 1: Explicitly registered. This sounds like it's just a special form of a candidate. Maybe we just add a "write-in" boolean to the definition of a candidate and that's completely it. It's now the voting machine's problem to figure out how to deal with mapping from a voter's free-text input to a "registered" candidate. This seems... difficult in practice, but that's "not our problem."

Case 2 or 3. First, the contest should have a flag set on whether write-ins are allowed at all. If the flag is false, then there's no write-in at all, and no need for special support to handle it. Otherwise, we'd have one or more "candidates" with names like "Write-in (1)" and "Write-in (2)" that have a flag set and are presented to the voter as blank lines that expand to text entry boxes in some machine-specific fashion.
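
A rough sketch of that flag and the placeholder candidates, in Kotlin. The types `Candidate` and `ContestDef` and the function `withWriteInSlots` are made up for illustration, and creating one placeholder per allowed vote is an assumption (the comment only says "one or more").

```kotlin
// Hypothetical types for illustration; not the library's Manifest classes.
data class Candidate(val name: String, val isWriteIn: Boolean = false)

data class ContestDef(
    val contestId: String,
    val votesAllowed: Int,
    val allowWriteIns: Boolean, // contest-level flag: are write-ins allowed at all?
    val candidates: List<Candidate>,
)

// Appends flagged placeholder candidates "Write-in (1)", "Write-in (2)", ...
// Assumption: one placeholder per allowed vote.
fun withWriteInSlots(contest: ContestDef): ContestDef {
    if (!contest.allowWriteIns) return contest // no write-in support needed
    val slots = (1..contest.votesAllowed).map { Candidate("Write-in ($it)", isWriteIn = true) }
    return contest.copy(candidates = contest.candidates + slots)
}
```

How a voting machine turns a flagged placeholder into a text entry box is left to the machine, as described above.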

My thinking on this is that we're back to the same ContestData discussion we had a while back. If we decided that the ContestData field was meant to be the voter's original intent (before any overvote processing or anything else at all), then we have a general-purpose encrypted data structure that we might subsequently shove through a mixnet prior to tallying. We could fake this by having the trustees decrypt each and every ContestData field, skip the mixing, and just publish an array of the resulting plaintexts. This would only ever happen if there were enough write-in fields used, in total, to have a possibility of winning the contest.
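
One way the "enough write-ins to possibly win" check could look, as a Kotlin sketch. The function name, its parameters, and the threshold rule are assumptions: it treats the worst case where every write-in vote went to the same name, and counts a tie with the last winning candidate as "could win".

```kotlin
// Hypothetical helper; conservative check on whether ContestData needs decrypting.
// candidateTallies: decrypted tallies of the registered candidates in one contest.
// writeInTotal: total number of write-in votes cast in that contest.
// seats: number of winners in the contest.
fun writeInsCouldAffectOutcome(
    candidateTallies: List<Int>,
    writeInTotal: Int,
    seats: Int = 1,
): Boolean {
    require(seats >= 1) { "seats must be at least 1" }
    if (writeInTotal == 0) return false
    val sorted = candidateTallies.sortedDescending()
    // Fewer registered candidates than seats: any write-in could take an open seat.
    if (sorted.size < seats) return true
    // Worst case: all write-in votes name the same person, who then needs to
    // match or beat the lowest currently winning tally.
    return writeInTotal >= sorted[seats - 1]
}
```

For example, with registered tallies of 100 and 40 in a single-seat contest, 30 write-in votes cannot change the winner, so the ContestData fields could stay encrypted.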

danwallach commented 2 years ago

Turns out, the VVSG standards have a lot to say about this (borrowed from an issue on a VotingWorks thread):

Doing these things in the context of homomorphic tallying is potentially complicated, especially the bit about a voter who tries to vote for a candidate under write-ins and normally.

The more I think about this, the more my brain hurts.

JohnLCaron commented 2 years ago

OK, I haven't absorbed all that; I presume much of it is for the election system (ES), and we just have to add whatever hooks are needed in our library to implement the various options.

My first pass implementation assumes that the ES gives us a list of write-in strings per contest in the input PlaintextBallot. Those go into the ContestData record which is encrypted and added to the EncryptedBallot. I have not yet added a lumped write-in selection, waiting for more spec. When decrypting ballots, I always decrypt the ContestData along with the selections using the guardian shares. This is adding ~10% to the cost of decrypting. That has let me test and verify encrypt/decrypt ContestData with minimal disruption to the workflow. I assume it will all change and complexify going forward.

One more detail is that write-ins count against the limit when detecting overvotes. An overvote triggers setting all selections in the contest to 0. The overvotes are recorded in the ContestData, so the original ballot can be fully recovered.
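
The overvote rule described here can be sketched in Kotlin as follows. The types and the function `processContest` are hypothetical stand-ins, not the classes used in the first-pass implementation; the sketch only shows write-ins counting toward the limit, selections being zeroed on an overvote, and the original choices being kept for ContestData.

```kotlin
// Hypothetical types for illustration; not the library's actual classes.
data class ContestVotes(
    val selections: Map<String, Int>, // selectionId -> vote (0 or 1)
    val writeIns: List<String>,       // write-in strings from the election system
)

data class ContestDataRecord(
    val overvoted: Boolean,
    val originalSelections: List<String>, // what the voter marked, kept when overvoted
    val writeIns: List<String>,
)

data class Processed(val selections: Map<String, Int>, val contestData: ContestDataRecord)

fun processContest(votes: ContestVotes, votesAllowed: Int): Processed {
    val chosen = votes.selections.filterValues { it > 0 }.keys.toList()
    // Write-ins count against the contest limit along with regular selections.
    val totalVotes = votes.selections.values.sum() + votes.writeIns.size
    val overvoted = totalVotes > votesAllowed
    // An overvote sets every selection in the contest to 0 before encryption.
    val encoded = if (overvoted) votes.selections.mapValues { 0 } else votes.selections
    // The original marks go into ContestData so the ballot can be recovered.
    val record = ContestDataRecord(overvoted, if (overvoted) chosen else emptyList(), votes.writeIns)
    return Processed(encoded, record)
}
```

In a vote-for-one contest, a ballot that marks candidate "a" and also writes in a name would be treated as an overvote: both selections are encoded as 0, while "a" and the write-in string are preserved in the record.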

If the ES can handle the "Reconciliation of double votes" and "Reconciliation of aliases" before sending us the PlaintextBallot, then that logic can stay out of our library.

danwallach commented 2 years ago

I'm thinking we should organize a "write-in summit" (i.e., a one hour Zoom) where the goal is to nail down all these particular details rather than just throwing an implementation together.

JohnLCaron commented 1 year ago

I need to do an implementation before I understand these things very deeply. The only danger is letting prototype implementations become the spec. And it may be that this is complicated enough that you want to do an implementation before you finalize it.

So, I'm OK with a summit, with the caveat that sometimes writing alternatives down to think about beforehand can help.