Open tarheel opened 1 year ago
I don't recall, but that seems very plausible. CDF has a notion of GPUnit (no, not the rap group; it's geo-political unit, a Bulgarian pop trio) which is intended to abstract things like precinct, precinct-split, district, etc.
So IIRC one could parse a CDF header and discover what labels are available. Maybe the easiest thing is to just assume that the GPUnit for a CVR is always the precinct, sort by that, and not even worry about such possibilities.
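For what it's worth, here's a rough sketch of that idea: read the GPUnit declarations, see which types are present, then group CVRs by the unit they point to. Everything about the XML here is an assumption for illustration. The element and attribute names (`Report`, `GpUnit`, `CVR`, `gpUnitId`, `type`, `name`) are a made-up, heavily simplified stand-in and not the real CDF schema, and the helper names are hypothetical, not anything in our codebase.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified stand-in for a CDF CVR file.
# Real CDF XML is namespaced and structured differently.
SAMPLE = """
<Report>
  <GpUnit id="gp-1" type="precinct" name="Precinct 1"/>
  <GpUnit id="gp-2" type="precinct" name="Precinct 2"/>
  <GpUnit id="gp-3" type="split-precinct" name="Precinct 2A"/>
  <CVR id="cvr-1" gpUnitId="gp-1"/>
  <CVR id="cvr-2" gpUnitId="gp-2"/>
  <CVR id="cvr-3" gpUnitId="gp-1"/>
  <CVR id="cvr-4"/>
</Report>
"""


def gp_unit_types(root):
    """Map each declared GpUnit id to its type label."""
    return {gp.get("id"): gp.get("type") for gp in root.iter("GpUnit")}


def group_cvrs_by_precinct(root):
    """Group CVR ids by precinct name.

    CVRs with no GpUnit reference, or one that is not a precinct,
    are collected under the key None so they can be reported.
    """
    types = gp_unit_types(root)
    names = {gp.get("id"): gp.get("name") for gp in root.iter("GpUnit")}
    grouped = defaultdict(list)
    for cvr in root.iter("CVR"):
        unit_id = cvr.get("gpUnitId")
        if types.get(unit_id) == "precinct":
            grouped[names[unit_id]].append(cvr.get("id"))
        else:
            grouped[None].append(cvr.get("id"))
    return dict(grouped)


root = ET.fromstring(SAMPLE)
available_types = sorted(set(gp_unit_types(root).values()))
by_precinct = group_cvrs_by_precinct(root)
```

Collecting the unmatched CVRs under `None` instead of dropping them makes it obvious when a file has no usable per-ballot precinct data, which would be the point at which to show an error.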
We don't have any test files that have per-ballot precinct data -- any chance you do, or know where to find one?
The code seems like it could support tabulate-by-precinct if we removed the check blocking it, but given that all CDF files I've seen don't have this data,
A secondary issue: looking at the spec (https://pages.nist.gov/ElectionResultsReporting/#17_0_2_4_78e0236_1389366224561_797289_2360), it seems like every ballot can have multiple precincts, which would pose a problem for the whole concept of "tabulate by precinct".
Anyway, I've enabled this, but the only observable effect without having usable data is that the error message is different.
@chughes297 do you have any thoughts on this? I'd feel a bit better getting your sign-off before approving #689.
I was just in Arlington, VA for RCTab stuff and the clerk there mentioned that CDF/Unisyn data does not have reliable precinct information included in the XML. I don't know enough about the CDF format to say for certain why that might be the case, but it does leave me uncertain about adding this feature...
I'd say it's your call, @artoonie, since you've already put the work in. If you really can't think of any risk to the changes in #689, I'm happy to go forward with approving it.
I think the existing error message is easier to understand for users. I'd prefer to not merge that PR to keep things clean. I'll close this out, and we can revisit if we ever need to do this in the future (in which case we'd also have workable test data).
Re-opening to do some research into CDF precinct data. I'm going to message John Dziurlaj, father of the CDF, for any insights he might have. @chughes297 is going to try to get some real-life CDF CVRs that have the data populated.
@moldover can you remember why we disallowed tabulation by precinct for CDF inputs in #245? I've dug through the code and various email threads and can't find any explanation. Maybe we just didn't bother at the time because it would have taken a bit of extra work?