votingworks / arlo

Parse Hart CVRs for counties with multiple tabulators (unblock OC) #1446

Closed benadida closed 2 years ago

jonahkagan commented 2 years ago

An idea: We can modify the Hart CVR parsing to accept multiple zip exports, one for each tabulator. To map each export to a tabulator in the manifest, the zip file name could be the tabulator name.
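
A rough sketch of what that file-name mapping could look like, to help with scoping. The function name, arguments, and error messages here are hypothetical and not taken from Arlo's codebase; it only assumes each export is a `.zip` whose base name is the tabulator name from the manifest.

```python
from pathlib import PurePath
from typing import Dict, Iterable, Set


def map_exports_to_tabulators(
    zip_file_names: Iterable[str], manifest_tabulators: Set[str]
) -> Dict[str, str]:
    """Map each tabulator name to the zip export named after it.

    Hypothetical helper: assumes the zip file's base name (without the
    .zip extension) is exactly the tabulator name used in the manifest.
    """
    exports_by_tabulator: Dict[str, str] = {}
    for file_name in zip_file_names:
        path = PurePath(file_name)
        if path.suffix.lower() != ".zip":
            raise ValueError(f"Expected a .zip export, got: {file_name}")
        tabulator = path.stem
        # Reject exports that don't correspond to any tabulator in the manifest.
        if tabulator not in manifest_tabulators:
            raise ValueError(f"No tabulator named {tabulator!r} in the manifest")
        # Reject two exports claiming the same tabulator.
        if tabulator in exports_by_tabulator:
            raise ValueError(f"Duplicate export for tabulator {tabulator!r}")
        exports_by_tabulator[tabulator] = file_name
    return exports_by_tabulator
```

Failing fast on an unknown or repeated tabulator name seems safer than guessing, since a mismatched name would otherwise attach ballots to the wrong tabulator.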

jonahkagan commented 2 years ago

When we do this, we need to make sure that we can support large CVRs (on the order of 1-3M ballot records, split across multiple tabulators). A few considerations:

ginvdr commented 2 years ago

Another thing to consider when we do this is how a county might add a placeholder batch... and we should probably discuss what that looks like scope-wise!

arsalansufi commented 2 years ago

A related case for us to cover: a jurisdiction that uses different modes while scanning (we recently learned of a special audit mode). Like the multiple-tabulators situation, this can also result in duplicate batch names. Unfortunately, unlike the multiple-tabulators situation, where CVRs can be exported by tabulator, CVRs can't be exported by mode. Scanned ballot information CSVs, however, can.

See https://votingworks.slack.com/archives/C01DUGG1D8E/p1660149845606469 for an example of this situation
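
To make the duplicate-batch-name problem concrete, here is a minimal sketch of how such collisions could be detected. The input shape (pairs of a group label and a batch name, where the group is a tabulator or a scanning mode) and the function name are assumptions for illustration, not Arlo's actual data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_ambiguous_batch_names(
    batches: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Return batch names that appear under more than one group.

    Hypothetical helper: each input pair is (group, batch_name), where the
    group is whatever distinguishes the exports (tabulator or scanning mode).
    """
    groups_by_batch: Dict[str, Set[str]] = defaultdict(set)
    for group, batch_name in batches:
        groups_by_batch[batch_name].add(group)
    # A batch name is ambiguous only if it is shared across groups.
    return {
        name: sorted(groups)
        for name, groups in groups_by_batch.items()
        if len(groups) > 1
    }
```

Any batch name this returns can't be looked up by name alone and would need the group as part of its key.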

arsalansufi commented 2 years ago

Tracking the file size concerns in a separate issue: https://github.com/votingworks/arlo/issues/1674