usnistgov / ElectionResultsReporting

Common data format specification for election results reporting data
https://pages.nist.gov/ElectionResultsReporting

Final RCV-related changes to V2 #16

Closed johnpwack closed 3 years ago

johnpwack commented 6 years ago

From the discussion and other comments, here's what I think should be done:

That's it. I think rank can be inferred from the actual vote counts.

If no negative comments by end of next week (12/15), I'll make the changes. After that point and with resolution of OtherCounts issue, I believe we are done; I would then update the spec and start it through the NIST pre-publication review.

davekadlecek commented 6 years ago

There are a number of attributes besides vote counts for a candidate that are associated with a particular round in an RCV contest, such as counts of exhausted ballots, of ballots that are overvoted in a round (which, depending on the particular RCV rules being used, might be counted for candidates in earlier and/or later rounds due to a different set of candidates continuing in those rounds), and ballots not counted in a round due to some other anomalous ballot-marking pattern. While I would prefer that these each be reported separately for each round, at a minimum whatever existing count in which they are included (Undervotes or Overvotes or ???) needs to be associated with a round.

There are also some items besides counts of votes for candidates and of various categories of non-votes that are typically associated with a round in a report of results of an RCV contest: candidates defeated (or elected, in the case of STV, aka multi-seat RCV) in a round, ballots transferred in a round, and transfer multipliers/weights (in the case of STV). These might not need to be in the Election Results Reporting CDF (though including them allows consumers of the results to explain the round-to-round changes without themselves reconstructing the RCV tabulation), but to the extent that they are, they need to be associated with a round.

Is the intention to have separate ElectionReport objects for each round of an RCV contest, where the round would be identified by the (identical) Round attribute of VoteCounts in the contest? Or is the intention that all rounds of an RCV contest could or would be reported in a single ElectionReport object? I think that on general grounds of elegance, the latter is a better way of proceeding, but to do that there would need to be a way to associate with a round those attributes that need to be so associated. The options are adding a Round attribute to existing elements that now include them or to which they may be added (e.g., OtherCounts and possibly Candidate or CandidateSelection), adding a separate Round element as a container for the various round-dependent attributes, or adding an optional Round attribute to an existing element that would directly or indirectly contain the round-dependent attributes (probably Contest).

I'm not sure which of these is best for an ultimate future design, but to have minimal changes for ERR 2.0, it's probably best just to add an optional Round attribute to OtherCounts (which is where exhausted ballots, etc., would be counted, whether as separate counts in a new attribute or included in an existing attribute).

davekadlecek commented 6 years ago

For STV (aka multi-seat RCV), the counts of exhausted ballots in virtually all versions, and in some versions of those not counted due to overvoting or to some other anomalous ballot-marking pattern in a round, can be fractional. Thus whatever categories are used to count them, presumably either Undervotes and Overvotes in OtherCounts or new attributes added to OtherCounts, need to be floats rather than integers.
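
As an illustration of why these counts can be fractional, here is a minimal sketch in Python. It assumes a Gregory-style surplus transfer in which every ballot of an elected candidate continues at a weight of surplus divided by that candidate's votes; the quota, vote totals, and variable names are hypothetical and not taken from any jurisdiction's rules or from the CDF.

```python
from fractions import Fraction

# Hypothetical numbers: candidate A is elected with 120 votes against a quota of 100.
votes_for_a = 120
quota = 100
surplus = votes_for_a - quota  # 20 votes to pass on

# Assumed rule: each of A's ballots continues at a reduced weight.
transfer_weight = Fraction(surplus, votes_for_a)  # 1/6

# Suppose 9 of A's 120 ballots rank no continuing candidate.
ballots_without_next_choice = 9

# Those ballots exhaust at the transfer weight, not as whole ballots.
exhausted = ballots_without_next_choice * transfer_weight  # 3/2

print(float(exhausted))  # 1.5
```

Under these assumptions the exhausted count for the round is 1.5, which an integer-typed attribute could not represent.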

carl3 commented 6 years ago

John, your changes are fine except Round needs to be moved to the base class so it can also qualify UnderVote/OverVote by round.

When writing the detailed documentation, be sure to define the semantics of the integer Round. Presumably Round 0 means a report of first choice votes, and Round 1 and up mean totals after eliminations, with some subsequent choices included. The highest Round value holds the final results, where winners are determined. [What does it mean if Round is omitted on an RCV contest? Round 0?]

There is one additional OtherCount statistic used, BallotsExhausted. It's not required as a declining BallotsCast could be used to compute the exhausted ballots, but in my opinion it's better to emit the value explicitly so BallotsCast is constant and matches the sum of other totals rather than some implicit meaning in missing numbers. It's a minor addition, only applicable to RCV contests.
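
A minimal sketch of the consistency check that an explicit exhausted count makes possible. The per-round totals and the dictionary keys are hypothetical; they are not the CDF's element names.

```python
BALLOTS_CAST = 100  # hypothetical total; stays constant across rounds

# Hypothetical results for one RCV contest, one entry per round.
rounds = [
    {"round": 0, "candidates": {"A": 45, "B": 35, "C": 15},
     "undervotes": 3, "overvotes": 2, "ballots_exhausted": 0},
    # C is eliminated: 10 of C's ballots transfer, 5 have no further choice.
    {"round": 1, "candidates": {"A": 50, "B": 40},
     "undervotes": 3, "overvotes": 2, "ballots_exhausted": 5},
]

def accounted_for(r):
    """Sum every category a ballot can fall into in one round."""
    return (sum(r["candidates"].values())
            + r["undervotes"] + r["overvotes"] + r["ballots_exhausted"])

# With an explicit exhausted count, every round sums to the same BallotsCast.
for r in rounds:
    assert accounted_for(r) == BALLOTS_CAST
```

Without the explicit count, a consumer would have to infer the 5 exhausted ballots in the second round from a gap in the totals.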

davekadlecek commented 6 years ago

Moving the optional attribute Round to the base class Counts as Carl suggests is fine as a way of making it available for Overvotes and Undervotes in RCV contests.
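
A rough sketch of what moving Round to the base class could look like, written as Python dataclasses. This is a simplified illustration, not the CDF schema: the field names (`gp_unit_id`, `count`, `overvotes`, `undervotes`) and their types are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Counts:
    """Simplified stand-in for the base class; Round lives here."""
    gp_unit_id: str               # hypothetical reference to a reporting unit
    round: Optional[int] = None   # omitted for non-RCV contests

@dataclass
class VoteCounts(Counts):
    count: float = 0.0            # votes for a selection in this round

@dataclass
class OtherCounts(Counts):
    overvotes: float = 0.0
    undervotes: float = 0.0

# Both subclasses inherit the optional round without redefining it.
candidate_votes = VoteCounts("precinct-1", round=2, count=50.0)
non_votes = OtherCounts("precinct-1", round=2, undervotes=3.0)
non_rcv = VoteCounts("precinct-1", count=120.0)  # round stays None
```

Because the attribute is declared once on the base class, overvotes and undervotes can be qualified by round in exactly the same way as candidate vote counts.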

On the semantics of the integer Round, some versions of RCV don't have a Round 0 at all, others use it to mean a count of first choice votes, and others use it to mean a count of effective first choice votes (the highest rank used on the ballot, which on a few ballots might not be the first choice but the second or third). Unless there is an intention to use the CDF to force a standardization of RCV methods, the definition should allow the values of Round for a contest to start with either 0 or 1. I'm not sure whether it would go further into the definition of the RCV method than is appropriate for a reporting CDF to say anything about using Round 0 for either a count of first choice votes or a count of effective first choice votes, or to say that all rounds after the one counting effective first choice votes count votes based on transfers of votes from candidates defeated in a previous round or of surplus votes from candidates elected in a previous round. It is also not quite true that the highest value of Round is the one where winners are determined. For example, in STV (multi-winner RCV), some winners can be elected before the final round (and that is generally the case, as that is where the surplus votes to transfer from winners come from). In IRV (single-winner RCV), there are reasons for wanting to continue eliminating candidates down to the last two even after the top candidate surpasses 50%, basically to see whether or not the winner has an overwhelming mandate: if a late round was 51% for A, 26% for B, and 23% for C, the real meaning could be 51% for A and 49% for anyone-but-A, or 74% for A or C and 26% for B. San Francisco currently does this (see the results of Supervisorial District 9 in 2016).
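
One way a consumer might stay neutral on whether rounds start at 0 or 1 is to rely only on the ordering of Round values. A minimal sketch, assuming counts are available as dictionaries with a hypothetical `round` key:

```python
def ordered_rounds(counts):
    """Return the distinct round values in ascending order.

    Entries with no round (e.g. non-RCV counts) are ignored.
    """
    return sorted({c["round"] for c in counts if c.get("round") is not None})

# Hypothetical feeds: one numbers rounds from 0, the other from 1.
zero_based = [{"round": 0}, {"round": 1}, {"round": 2}]
one_based = [{"round": 1}, {"round": 2}, {"round": 3}, {"round": None}]

# The first round is simply the smallest value present in the contest.
first_zero = ordered_rounds(zero_based)[0]
first_one = ordered_rounds(one_based)[0]
```

This treats Round purely as an ordering key and makes no assumption about which round elects a winner.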

I agree that it is preferable to add BallotsExhausted as an additional OtherCounts statistic. However, I'm not sure whether it is necessary for V2 or can wait until later. I would note that if it isn't added, a better workaround than decreasing BallotsCast (and the one used in San Francisco's initial RCV implementation) is to combine exhausted ballots with undervotes in a single total that would typically increase from round to round, rather than having undervotes stay constant over the rounds (counting only ballots on which the contest was left blank).

johnpwack commented 6 years ago

OK Dave and Carl - so changes to make are:

C'est tout?

davekadlecek commented 6 years ago

John, if BallotsExhausted is added, it should be as an attribute (of type float) to OtherCounts, not as a CandidatePostElection enum value.

Overvotes and Undervotes only need to become round-dependent and able to accommodate non-integer values if they are used to report exhausted ballots (both "exhausted ballots," meaning no more choices for continuing candidates, and "ballots exhausted by overvoting," meaning those overvoted only at a lower preference that has come into play after higher preferences are elected/eliminated; the two are often reported separately). However, rather than trying to resolve the issue now, maybe it is best to just add BallotsExhausted now and make it, Overvotes, and Undervotes all round-dependent floats. That leaves for later the question of whether to count vanilla exhausted ballots in BallotsExhausted, Undervotes, or both, and ballots exhausted by overvoting in BallotsExhausted, Overvotes, or both.

johnpwack commented 6 years ago

My mistake in the previous post. I added 'defeated' as a post-election status, at the request of several others. My preference is to not add BallotsExhausted to this version, keeping things as simple as we can. What do others think?

carl3 commented 6 years ago

Adding 'defeated' is good for several purposes.

In my opinion, adding BallotsExhausted doesn't really add much complexity, but without it, the missing reported content introduces complexity and ambiguity. It's only applicable to RCV contests (as OtherCount), and always exists when there are many more candidates than rankings possible. Trying to misuse UnderVotes or OverVotes is much more complex and creates interoperability problems. Dealing with changing vote counts is more complex than just reporting the exhausted count.

Why not just add what people are using now?

jdmgoogle commented 6 years ago

Thanks for centralizing this discussion. I have a few observations and questions:

  1. I support the idea of having all this in one file as opposed to spread out over multiple files.
  2. Why are the ballot counts floats? In general, that is.
  3. I'm assuming that the plan is to use the "defeated" status for a candidate in general, and not apply it to a specific round?
  4. While hoping to establish best practices for which round is officially the start of RCV (i.e., 0 or 1) my experience is that that's hopelessly optimistic. :) For example, with sequenceOrder for contests and candidates, the only rule we've seen is "larger numbers should be placed after smaller numbers." I've seen feeds in which all federal contests are assigned single-and-double digit numbers, state contests start at 100, and local contests start at 1000.
  5. I'd prefer to add as few new elements as possible, and have those elements tied to specific, broad use cases; i.e., use cases which are known to span beyond one jurisdiction.

I think Mark is going to weigh in with some other feedback, too.

JDziurlaj commented 6 years ago

Have all these changes been made to everyone's satisfaction? Can this issue be closed?

JDziurlaj commented 3 years ago

There has been no action on this issue in over two years. I am closing it.