BrightSpots / rcv

Ranked Choice Voting Universal Tabulator
Mozilla Public License 2.0
71 stars 19 forks source link

Updating Error Messaging for CVRs #632

Closed rrojas350 closed 1 year ago

rrojas350 commented 1 year ago

See Issue #624 for details.

When RCTab encounters an error with a CVR the messaging needs to be clearer than an NPE and let the user know if something is missing.

tarheel commented 1 year ago

Are there other known cases in which this happens? In any case, I guess the action item for this issue is to perform an audit of the CVR ingestion code to try to fix any other places where it might occur.

rrojas350 commented 1 year ago

Nothing comes to mind but this issue caused enough of a stir that if we can be proactive about a fix it's worth the effort.

moldover commented 1 year ago

Hey guys, exciting to see the recent activity here. I wanted to give some thoughts on CDF and how RCTab might approach dealing with it.

TL;DR: update RCTab CDF parsing code in response to actual vendor CDF implementations and the schema they expect / provide.

CDF is an abstract data schema designed with flexibility in mind. It allows for many different implementations. See the uml diagram in our repo to appreciate that, many of the components (including GPUnit) are totally optional (denoted by 1..0 which indicates the element may appear 1 or 0 times) and this is still "valid" CDF. Many of the elements can be in an array (or not) denoted by 1..* or (the asterisk means "any number of instances").

So, it's certainly possible to build code in RCTab which validates input data against the entire schema (and https://github.com/JDziurlaj has done this) but for example, a missing GPUnit is still technically valid CDF. So what should RCTab do in this case?

https://github.com/JDziurlaj has put forth the idea of CDF "profiles" which are basically just a concrete implementation of CDF, which implement some subset of the elements. (btw he is the foremost expert on CDF so send your questions his way) In this new example, it appears the input data has no GPUnits. Is that valid? What of the other problems @tarheel found? Are they actual problems, and which are ok? Only the CDF data provider can answer these questions, and so I would suggest that RCTab extend its code to ingest new data providers as they emerge.

There are I'm sure many other general improvements which could be done, and I would agree that any input which throws a NPE should at least be caught and translated into a generic parse error.

artoonie commented 1 year ago

I believe the easy version of this is done by #309.

We should get JDiurlaj's implementation by April 1 and we can determine then if it's integratable here: https://github.com/usnistgov/cdf-test-method

Is there anything else not encompassed by these two? If not, perhaps we can rename this task to integrate (or explore integrating) that repo.

artoonie commented 1 year ago

The repo is now live. I have a few concerns with using it:

  1. Ongoing Maintenance Cost: The schemas may not cover all possible file formats. We'd likely need to maintain our own fork of that repo to ensure that we can rapidly add our own schemas.
  2. Dependencies: It relies heavily on four external dependencies: MorganaXProc [GPLv3], Saxon-HE 11.5 [MPLv2], schxslt [MIT], Xerces 2.12.2 [Apache]. What are the implications?
  3. Effectiveness: Most NPEs seem unlikely to be caught by schema validation -- my read is that they're most likely caused by data issues, not format issues.
  4. Doesn't achieve goal of human-friendly errors: Converting the errors it provides to something human-readable might take more code than just checking each item ourselves.

I don't have as much familiarity as y'all do with what causes NPEs, but I'm not sure schema validation will be worth the implementation effort currently.

My gut feeling here is that we already have a basic validation system, and we should clean and improve that over time as we get reports of NPEs. If users are frequently complaining about NPEs with unknown causes, having a vague error message won't help any more than the NPE error message. In sum: we should punt on this task, and instead focus on gathering as many files that create NPEs and figuring out user-friendly messaging one at a time while hardening our existing validation system.

I'd love to hear thoughts from those more familiar with this code. @tarheel @HEdingfield?

HEdingfield commented 1 year ago

I think #644 made some progress on this.

artoonie commented 1 year ago

I'm not sure what additional immediate action can be taken here -- can we consider this closed until we have a more concrete plan, or move this out of 1.4.0 requirements so we can sit on it longer and come up with a plan?

HEdingfield commented 1 year ago

Agreed, I'd suggest we open a new reminder issue outside of v1.4.0 and link back to your nice summary in https://github.com/BrightSpots/rcv/issues/632#issuecomment-1495046567.

Then we can close this so it's still in the record as being some work we did for 1.4.0.

artoonie commented 1 year ago

SGTM. Issue linked above.