BrightSpots / rcv

Ranked Choice Voting Universal Tabulator
Mozilla Public License 2.0

support tabulating by precinct for CDF input #623

Open tarheel opened 1 year ago

tarheel commented 1 year ago

@moldover can you remember why we disallowed tabulation by precinct for CDF inputs in #245? I've dug through the code and various email threads and can't find any explanation. Maybe we just didn't bother at the time because it would have taken a bit of extra work?

moldover commented 1 year ago

I don't recall, but that seems very plausible. CDF has a notion of GPUnit (no, not the rap group; it's geo-political unit, a Bulgarian pop trio), which is intended to abstract things like precinct, precinct-split, district, etc.
So IIRC one could parse a CDF header and discover what labels are available. Maybe the easiest thing is to just assume that the GPUnit for a CVR is always the precinct, sort by that, and not even worry about such possibilities.
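
A rough sketch of that idea, in Python for brevity rather than the tabulator's own code. The XML below is a toy: the element and attribute names (`Report`, `GpUnit`, `CVR`, `gpUnitId`) are simplified stand-ins and not the real CDF schema, and the function names are hypothetical. It reads the available GPUnit labels, then groups CVRs under the assumption that each CVR's GPUnit is its precinct.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Toy input only: element and attribute names are simplified stand-ins,
# NOT the real CDF schema.
SAMPLE = """
<Report>
  <GpUnit id="gp1" type="precinct" name="Precinct 1"/>
  <GpUnit id="gp2" type="precinct" name="Precinct 2"/>
  <GpUnit id="gp3" type="district" name="Ward A"/>
  <CVR id="cvr1" gpUnitId="gp1"/>
  <CVR id="cvr2" gpUnitId="gp2"/>
  <CVR id="cvr3" gpUnitId="gp1"/>
</Report>
"""


def gp_unit_labels(root):
    """Map each GpUnit id to its (type, name) label."""
    return {u.get("id"): (u.get("type"), u.get("name")) for u in root.iter("GpUnit")}


def group_by_gp_unit(root):
    """Group CVR ids by the name of the GpUnit each one references.

    Assumption: the referenced GpUnit is always the precinct.
    """
    labels = gp_unit_labels(root)
    groups = defaultdict(list)
    for cvr in root.iter("CVR"):
        _kind, name = labels[cvr.get("gpUnitId")]
        groups[name].append(cvr.get("id"))
    return dict(groups)


root = ET.fromstring(SAMPLE.strip())
by_precinct = group_by_gp_unit(root)
```

Note that the "always the precinct" assumption is doing all the work here: nothing in the sketch checks the GPUnit's type before grouping.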

artoonie commented 1 year ago

We don't have any test files that have per-ballot precinct data -- any chance you do, or know where to find one?

The code seems like it could support tabulate-by-precinct if we removed the check blocking it, but none of the CDF files I've seen have this data.

A secondary issue: looking at the specs (https://pages.nist.gov/ElectionResultsReporting/#17_0_2_4_78e0236_1389366224561_797289_2360), it seems like every ballot can have multiple precincts, which would pose a problem for the whole concept of "tabulate by precinct".
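
To illustrate the ambiguity, here is a minimal Python sketch of one possible policy, assuming each ballot arrives as a list of `(type, name)` pairs for its GPUnits. That shape and the `resolve_precinct` helper are hypothetical, not anything in the tabulator or the spec. It accepts a ballot only when exactly one of its units is typed as a precinct, and reports everything else as unresolvable.

```python
def resolve_precinct(gp_units):
    """Return the ballot's precinct name, or None if it is missing or ambiguous.

    gp_units: list of (type, name) pairs for one ballot (hypothetical shape).
    """
    precincts = [name for kind, name in gp_units if kind == "precinct"]
    # Exactly one precinct-typed unit is the only unambiguous case.
    return precincts[0] if len(precincts) == 1 else None


one = resolve_precinct([("precinct", "Precinct 1"), ("district", "Ward A")])
two = resolve_precinct([("precinct", "Precinct 1"), ("precinct", "Precinct 2")])
none = resolve_precinct([("district", "Ward A")])
```

Whether an unresolvable ballot should be an error or simply be left out of the per-precinct results is a separate decision that this sketch does not make.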

Anyway, I've enabled this, but the only observable effect without having usable data is that the error message is different.

HEdingfield commented 1 year ago

@chughes297 do you have any thoughts on this? I'd feel a bit better getting your sign-off before approving #689.

chughes297 commented 1 year ago

I was just in Arlington, VA for RCTab stuff, and the clerk there mentioned that CDF/Unisyn data does not have reliable precinct information included in the XML. I don't know enough about the CDF format to say for certain why that might be the case, but it does leave me uncertain about adding this feature...

HEdingfield commented 1 year ago

I'd say it's your call, @artoonie, since you've already put the work in. If you really can't think of any risk to the changes in #689, I'm happy to go forward with approving it.

artoonie commented 1 year ago

I think the existing error message is easier to understand for users. I'd prefer to not merge that PR to keep things clean. I'll close this out, and we can revisit if we ever need to do this in the future (in which case we'd also have workable test data).

yezr commented 4 months ago

Re-opening to do some research into CDF precinct data. I'm going to message John Dziurlaj, father of the CDF, for any insights he might have. @chughes297 is going to try to get some real-life CDF CVRs that have the data populated.