estimated error rates for stage/round one RLA batch size

kiniry commented 7 years ago

Stark's calculator uses estimated error rates for:

one-vote overstatements of 0.001 (0.1%, i.e. 1 in 1,000 audited ballots contains such an error),
two-vote overstatements of 0.0001 (0.01%),
one-vote understatements of 0.001 (0.1%), and
two-vote understatements of 0.0001 (0.01%).

@nealmcb initially suggested that we use estimated error rates customized for Colorado of 0.01 for all four of these parameters. Experimenting with such rates leads @dmzimmerman and me to conclude that this is likely too conservative, as the number of ballots that must be audited is relatively large, albeit comparable to the volume of audits under Colorado's previous statutes (500 ballots or 5% of all ballots, whichever is smaller).
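To make the comparison concrete, here is a rough sketch of how expected error rates feed into an initial sample size. It assumes the commonly published Kaplan-Markov form used for ballot-level comparison audits; the function name, the `gamma=1.03905` default, the 5% risk limit, and the example margins are my assumptions, and Stark's calculator may apply additional rounding of expected discrepancy counts that is not reproduced here.

```python
import math


def initial_sample_size(margin, risk_limit, o1_rate, o2_rate, u1_rate, u2_rate,
                        gamma=1.03905):
    """Estimate the first-round sample size for a comparison audit.

    margin      diluted margin of the contest (e.g. 0.02 for 2%)
    risk_limit  risk limit alpha (e.g. 0.05)
    o1_rate     expected rate of one-vote overstatements
    o2_rate     expected rate of two-vote overstatements
    u1_rate     expected rate of one-vote understatements
    u2_rate     expected rate of two-vote understatements
    gamma       error inflation factor (assumed default)

    Returns None when the expected errors outweigh the margin, i.e. when
    no finite sample is expected to confirm the outcome.
    """
    # Expected per-ballot change in the log of the test statistic caused
    # by discrepancies. Overstatements contribute negative terms.
    expected_log_change = (
        o1_rate * math.log(1 - 1 / (2 * gamma))
        + o2_rate * math.log(1 - 1 / gamma)
        + u1_rate * math.log(1 + 1 / (2 * gamma))
        + u2_rate * math.log(1 + 1 / gamma)
    )
    denominator = margin + 2 * gamma * expected_log_change
    if denominator <= 0:
        return None
    return math.ceil(-2 * gamma * math.log(risk_limit) / denominator)


# Rates from this thread: Stark's defaults versus 0.01 for all four.
STARK = (0.001, 0.0001, 0.001, 0.0001)
ALL_ONE_PERCENT = (0.01, 0.01, 0.01, 0.01)

for margin in (0.02, 0.05, 0.10):
    print(margin,
          initial_sample_size(margin, 0.05, *STARK),
          initial_sample_size(margin, 0.05, *ALL_ONE_PERCENT))
```

Under these assumptions the two-vote overstatement term dominates, because `log(1 - 1/gamma)` is strongly negative when `gamma` is close to 1. For small margins, setting all four rates to 0.01 can push the denominator to zero or below, which the sketch reports as `None`.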

What we really need is information, from either the EAC VVSG test reports for Dominion's Democracy Suite or a direct conversation with Dominion, that tells us what error rates their equipment exhibits in field use. If we had that data, we could, and should, use it in our calculations.

CC @dmzimmerman @nealmcb

nealmcb commented 7 years ago

I'm not worried about high Dominion error rates. If those were the only discrepancies, Stark's numbers would be better, though of course we don't have much real data yet. But what data we do have from Colorado audits shows that there are other conditions under which 2-vote overstatements come into play. E.g. any phantom ballot has a big chance of being a 2-vote overstatement. Phantom ballots come up when counties haven't finished tabulation (as allowed by Rule 25, e.g. for provisionals), or when for any reason they can't find a ballot or choose not to reveal it to the audit board and the public, e.g. due to ballot anonymity/linkability concerns. This will vary based on county practice, but has come up in previous pilot audits. That's the main reason I think it would be valuable to make these rates configurable per-county.

dmzimmerman commented 7 years ago

I think the user interface issues involved in making this configurable per county are fairly daunting. Intuition about this sort of number is iffy at best.

sfsinger19103 commented 7 years ago

CDOS has specified that for Round One they want the minimum number of ballots such that, if no overstatements are found, the audit will end.
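As a rough sketch of what that minimum looks like: if the audit uses the Kaplan-Markov comparison method and no discrepancies at all turn up, the sample size depends only on the risk limit, the diluted margin, and the inflation factor. The function name, the `gamma=1.03905` default, and the example values are assumptions for illustration, not taken from the CDOS specification.

```python
import math


def round_one_minimum(margin, risk_limit, gamma=1.03905):
    """Smallest sample that ends the audit if every audited ballot matches.

    margin      diluted margin of the contest (e.g. 0.02 for 2%)
    risk_limit  risk limit alpha (e.g. 0.05)
    gamma       error inflation factor (assumed default)
    """
    if not 0 < margin <= 1:
        raise ValueError("margin must be in (0, 1]")
    if not 0 < risk_limit < 1:
        raise ValueError("risk_limit must be in (0, 1)")
    # With zero discrepancies the expected-error terms vanish, leaving
    # n = ceil(-2 * gamma * ln(alpha) / margin).
    return math.ceil(-2 * gamma * math.log(risk_limit) / margin)


# Example: tighter margins need proportionally more ballots.
for margin in (0.01, 0.02, 0.05, 0.10):
    print(margin, round_one_minimum(margin, risk_limit=0.05))
```

This is the same calculation as the general estimate with all four expected error rates set to zero, so any overstatement found in Round One would require more ballots in a later round.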

nealmcb commented 6 years ago

Setting these via a properties file presumably makes more sense than hard-coding them, and is the easiest improvement to make, at the appropriate time.
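For illustration, here is a minimal sketch of reading the four rates from properties-style text, falling back to Stark's values from this thread when a key is absent. The `rla.error_rate.` prefix and the key names are hypothetical, and the parser only handles simple `key = value` lines, not the full Java properties syntax.

```python
# Defaults are Stark's values quoted in this thread.
DEFAULT_RATES = {
    "one_vote_over": 0.001,
    "two_vote_over": 0.0001,
    "one_vote_under": 0.001,
    "two_vote_under": 0.0001,
}


def load_error_rates(text, prefix="rla.error_rate."):
    """Return the four error rates, overriding defaults from `text`.

    `prefix` and the key names are hypothetical; keys without the prefix
    are ignored so the same file can hold unrelated settings.
    """
    rates = dict(DEFAULT_RATES)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        key = key.strip()
        if not separator or not key.startswith(prefix):
            continue
        name = key[len(prefix):]
        if name not in rates:
            raise ValueError(f"unknown error rate: {name}")
        rate = float(value)
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
        rates[name] = rate
    return rates


example = """
# Hypothetical override for two-vote overstatements only.
rla.error_rate.two_vote_over = 0.01
"""
print(load_error_rates(example))
```

Keeping the defaults in one place means updated estimates from later elections would only need a change to the properties file, not to the code.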

When we have data from the November 2017 election, we can revisit this issue and presumably update our default estimates for Round 1 and potentially subsequent rounds. We can also evaluate how these values are configured. So I'll leave this issue open.

FreeAndFair / ColoradoRLA

estimated error rates for stage/round one RLA batch size #422