Closed mattbeattie closed 2 years ago
~Below is an "empty" example CVR with no actual CVRs. It just attempts to construct the basics of the CVR, given example_1 as the input.~
~As you can see, a lot of things are unknown because I didn't know where to get the data from the EDF. I'm also not convinced that I did the Contest array correctly - especially when it comes to write-ins and ballot measures.~
~I'm going to put a pin in this for now. I can't really move forward with the CVR array generation until I know how many CVRs to generate (question 2 above), and there are a number of things that need to be clarified before I can fill in the missing gaps below.~
edit: removed the example since I'll have a better one below
Thanks to Cliff, I made some more progress. Most recent status in a comment below this one.
This is quite a lot of questions, @mattbeattie. I'll answer them as quickly as I can today.
I'm working on updating the set of CVRs that correspond with the Jetsons EDFs. It's still a WIP, but you can look on the jetsons branch.
Looking at question 2 above:
How do I know how many CVRs to generate in the top-level CVR array? The spec says "one per cast vote record in the report" and also "Each sheet of a multi-page paper ballot is represented by an individual CVR." How do I know how many CVRs I need to generate? Given that Markit's Ballot Markup application will show one contest at a time, does that mean we'll have one CVR per contest?
Remember the primary purpose of the CVR (section 2.1 of the spec) is to provide a common output format for ballot-scanning devices to record voter choices. For a device that is scanning paper ballots, it may make sense to produce one record per page of a multi-page ballot, but there is no reason why MarkIt shouldn't produce a single CVR containing all the voter's choices.
For question 9:
CVRs have BallotImage as an optional top-level property, defined as "An image of the ballot sheet created by the scanning device." Given that the Markit application does not have images of the ballot, can we simply not provide this? Can we safely ignore anything related to images?
I'd say yes: we can ignore anything related to images of paper ballots, because Markit is not a scanner. Remember that these NIST-1500 formats were designed to cover a number of use cases/scenarios, and that not all elements/attributes apply to all cases.
@cwulfman thank you!
Let me take another pass at this. I want to generate the full CVR with the user's choices and everything. I'll keep things as simple as possible while still following the spec. I'll continue to use placeholders for "unknown" values, and we can address as needed.
I'll update this issue again once I've got that working.
Two big updates:
I've updated my top-level comment accordingly.
There are now 7 real questions for which I'll need answers. Most ask "where do I get X from the EDF?" Until I have answers, the gaps will be evident in the "unknown" values - which is totally okay for now, I think.
This is an "empty" CVR, in that it does not have any markup applied to it. That said, it is functionally complete and contains all required values.
The next step is to "merge" the voter's selections (contained in the ballot markup model) into the generated empty CVR. More to come.
{
  "CVR": [
    {
      "CurrentSnapshotId": "current snapshot ID unknown!",
      "CVRSnapshot": [
        {
          "@id": "snapshot ID unknown!",
          "CVRContest": [
            {
              "ContestId": "contest-orbit-city-mayor",
              "CVRContestSelection": [
                { "ContestSelectionId": "selection_spacely", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_cogswell", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_writein_1", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" }
              ],
              "@type": "CVR.CVRContest"
            },
            {
              "ContestId": "contest-spaceport-control-board",
              "CVRContestSelection": [
                { "ContestSelectionId": "selection_jetson", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_ellis", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_indexer", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_writein_2", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "selection_writein_3", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" }
              ],
              "@type": "CVR.CVRContest"
            },
            {
              "ContestId": "ballotmeasure-1",
              "CVRContestSelection": [
                { "ContestSelectionId": "ballotmeasure-1_yes", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" },
                { "ContestSelectionId": "ballotmeasure-1_no", "SelectionPosition": [ { "HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0, "@type": "CVR.SelectionPosition" } ], "TotalNumberVotes": 0, "@type": "CVR.CVRContestSelection" }
              ],
              "@type": "CVR.CVRContest"
            }
          ],
          "Type": "original",
          "@type": "CVR.CVRSnapshot"
        }
      ],
      "ElectionId": "election ID unknown!",
      "SequenceNumber": "sequence number unknown!",
      "@type": "CVR.CVR"
    }
  ],
  "Election": [
    {
      "@id": "election ID unknown!",
      "Candidate": [
        { "@id": "candidate_spacely", "Code": [ { "Type": "candidate code type unknown!", "Value": "candidate_spacely", "@type": "CVR.Code" } ], "@type": "CVR.Candidate" },
        { "@id": "candidate_cogswell", "Code": [ { "Type": "candidate code type unknown!", "Value": "candidate_cogswell", "@type": "CVR.Code" } ], "@type": "CVR.Candidate" },
        { "@id": "candidate_jetson", "Code": [ { "Type": "candidate code type unknown!", "Value": "candidate_jetson", "@type": "CVR.Code" } ], "@type": "CVR.Candidate" },
        { "@id": "candidate_ellis", "Code": [ { "Type": "candidate code type unknown!", "Value": "candidate_ellis", "@type": "CVR.Code" } ], "@type": "CVR.Candidate" },
        { "@id": "candidate_indexer", "Code": [ { "Type": "candidate code type unknown!", "Value": "candidate_indexer", "@type": "CVR.Code" } ], "@type": "CVR.Candidate" }
      ],
      "Code": [ { "Type": "election code type unknown!", "Value": "election ID unknown!", "@type": "CVR.Code" } ],
      "Contest": [
        {
          "@id": "contest-orbit-city-mayor",
          "Code": [ { "Type": "contest code type unknown!", "Value": "contest-orbit-city-mayor", "@type": "CVR.Code" } ],
          "ContestSelection": [
            { "@id": "selection_spacely", "CandidateIds": ["candidate_spacely"], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_cogswell", "CandidateIds": ["candidate_cogswell"], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_writein_1", "CandidateIds": [], "@type": "CVR.CandidateSelection" }
          ],
          "@type": "CVR.CandidateContest"
        },
        {
          "@id": "contest-spaceport-control-board",
          "Code": [ { "Type": "contest code type unknown!", "Value": "contest-spaceport-control-board", "@type": "CVR.Code" } ],
          "ContestSelection": [
            { "@id": "selection_jetson", "CandidateIds": ["candidate_jetson"], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_ellis", "CandidateIds": ["candidate_ellis"], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_indexer", "CandidateIds": ["candidate_indexer"], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_writein_2", "CandidateIds": [], "@type": "CVR.CandidateSelection" },
            { "@id": "selection_writein_3", "CandidateIds": [], "@type": "CVR.CandidateSelection" }
          ],
          "@type": "CVR.CandidateContest"
        },
        {
          "@id": "ballotmeasure-1",
          "Code": [ { "Type": "contest code type unknown!", "Value": "ballotmeasure-1", "@type": "CVR.Code" } ],
          "ContestSelection": [],
          "@type": "CVR.CandidateContest"
        }
      ],
      "@type": "CVR.Election"
    }
  ],
  "GeneratedDate": "2062-01-01T12:00:00-08:00",
  "GpUnit": [
    { "@id": "state_farallon", "Type": "state", "@type": "CVR.GpUnit" },
    { "@id": "county_gadget", "Type": "county", "@type": "CVR.GpUnit" },
    { "@id": "district_orbit_city", "Type": "city", "@type": "CVR.GpUnit" },
    { "@id": "district_aldrin_spaceport", "Type": "city", "@type": "CVR.GpUnit" },
    { "@id": "precinct_1_downtown", "Type": "precinct", "@type": "CVR.GpUnit" },
    { "@id": "precinct_2_spacetown", "Type": "precinct", "@type": "CVR.GpUnit" },
    { "@id": "precinct_3_spaceport", "Type": "precinct", "@type": "CVR.GpUnit" },
    { "@id": "precinct_4_bedrock", "Type": "precinct", "@type": "CVR.GpUnit" }
  ],
  "ReportGeneratingDeviceIds": ["election ID unknown!"],
  "ReportingDevice": [ { "@id": "reporting device id unknown!", "Application": "Trust the Vote application", "@type": "CVR.ReportingDevice" } ],
  "Version": "1.0.0"
}
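The "merge" step mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the function name `merge_selections` and the shape of `chosen` (a mapping from ContestId to the set of ContestSelectionIds the voter picked) are hypothetical stand-ins for the ballot markup model, and write-in text is not handled. It assumes one vote per marked position and computes TotalNumberVotes as the sum of NumberVotes.

```python
import copy


def merge_selections(empty_report, chosen):
    """Return a copy of the report with the voter's choices marked.

    `chosen` is a hypothetical stand-in for the ballot markup model:
    it maps a ContestId to the set of ContestSelectionIds the voter picked.
    """
    report = copy.deepcopy(empty_report)  # leave the empty CVR untouched
    for cvr in report["CVR"]:
        for snapshot in cvr["CVRSnapshot"]:
            for contest in snapshot["CVRContest"]:
                picked = chosen.get(contest["ContestId"], set())
                for selection in contest["CVRContestSelection"]:
                    marked = selection["ContestSelectionId"] in picked
                    for position in selection["SelectionPosition"]:
                        position["HasIndication"] = "yes" if marked else "no"
                        # Assumption: a marked position always carries 1 vote.
                        position["NumberVotes"] = 1 if marked else 0
                    selection["TotalNumberVotes"] = sum(
                        p["NumberVotes"] for p in selection["SelectionPosition"]
                    )
    return report


def empty_selection(selection_id):
    """Build one unmarked selection, shaped like those in the example above."""
    return {
        "ContestSelectionId": selection_id,
        "SelectionPosition": [
            {"HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0,
             "@type": "CVR.SelectionPosition"}
        ],
        "TotalNumberVotes": 0,
        "@type": "CVR.CVRContestSelection",
    }


# A cut-down empty report: one contest with two selections.
empty_report = {
    "CVR": [{
        "CVRSnapshot": [{
            "CVRContest": [{
                "ContestId": "contest-orbit-city-mayor",
                "CVRContestSelection": [
                    empty_selection("selection_spacely"),
                    empty_selection("selection_cogswell"),
                ],
                "@type": "CVR.CVRContest",
            }],
        }],
    }],
}

marked_report = merge_selections(
    empty_report, {"contest-orbit-city-mayor": {"selection_spacely"}}
)
```

Working on a deep copy keeps the generated empty CVR reusable as a template for the next ballot.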
In the top-level CVR array, what is the SequenceNumber used for? It's not listed in the spec. The value does not appear to be sequential or even unique. Where do we get this from the EDF?
from the spec documentation (p. 67):
Each CVR element also includes an optional sequence number (SequenceNumber); this isn’t required but could be helpful to auditors.
This is, then, for the case in which a CastVoteRecordReport contains many CVRs. Not applicable in our case.
Will the top-level ReportGeneratingDeviceIds always have exactly one element, and will that element correspond to the election ID?
I don't think we've ever talked about what a reporting device is in our context.
ReportingDevice is used to specify a voting device as the “political geography” at hand. CastVoteRecordReport refers to it as ReportGeneratingDevice and uses it to specify the device that created the CVR report. CVR refers to it as CreatingDevice to specify the device that created the CVRs.
I defer to @trustthevote to answer this
Where do we get the election ID from the EDF?
Remember that in XML, ID/IDRef attributes are local to the document. The CastVoteRecordReport requires at least one Election element, and the Election element requires a value for the ObjectId attribute (the value must be an xsd:ID). The CVRR can contain more than one Election element, so the ElectionId/ObjectId pairing serves to link a CVR to the election to which it applies. Within the context of the CVRR, that ObjectId can be an arbitrary, document-specific identifier, which your application, presumably, would generate.
NIST-1500-100 defines an optional ExternalIdentifier property for its Election element, but it isn't clear how that identifier, if it exists, should be used in the CVRR: perhaps as the value of the Code element.
This is another place where these two schemas, while related, do not dovetail precisely.
In the top-level CVR array, the CurrentSnapshotId corresponds to the single CVRSnapshot element's @id value. Where does this value come from in the EDF?
It doesn't come from the EDF. As in the ElectionIdentifier discussion above, these are ID/IDRef pairs, local to the scope of the XML document.
This is also an example, by the way, of a fundamental difference between XML and JSON: JSON has no equivalent of ID/IDREF.
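Since JSON itself will not enforce these pairings, the generating application has to create the identifiers and keep the references consistent on its own. A minimal sketch of that idea follows; the helper names (`new_local_id`, `assign_local_ids`, `dangling_references`) are hypothetical, and it assumes a single Election and that the first snapshot is the current one. The letter prefix is there because XML ID values cannot begin with a digit.

```python
import uuid


def new_local_id(prefix):
    """Generate a document-local identifier (hypothetical naming scheme)."""
    # Starting with a letter keeps the value valid as an XML ID as well.
    return f"{prefix}-{uuid.uuid4().hex}"


def assign_local_ids(report):
    """Give the election and each snapshot an ID and wire up the references.

    Assumes exactly one Election and that the first snapshot is current.
    Mutates and returns `report`.
    """
    election_id = new_local_id("election")
    report["Election"][0]["@id"] = election_id
    for cvr in report["CVR"]:
        cvr["ElectionId"] = election_id
        for snapshot in cvr["CVRSnapshot"]:
            snapshot["@id"] = new_local_id("snapshot")
        cvr["CurrentSnapshotId"] = cvr["CVRSnapshot"][0]["@id"]
    return report


def dangling_references(report):
    """List reference properties that do not point at an @id in the document."""
    election_ids = {election["@id"] for election in report["Election"]}
    problems = []
    for index, cvr in enumerate(report["CVR"]):
        if cvr["ElectionId"] not in election_ids:
            problems.append(f"CVR[{index}].ElectionId")
        snapshot_ids = {snapshot["@id"] for snapshot in cvr["CVRSnapshot"]}
        if cvr["CurrentSnapshotId"] not in snapshot_ids:
            problems.append(f"CVR[{index}].CurrentSnapshotId")
    return problems


# A skeleton report whose placeholder references do not line up yet.
skeleton = {
    "CVR": [{
        "CurrentSnapshotId": "current snapshot ID unknown!",
        "CVRSnapshot": [{"@id": "snapshot ID unknown!"}],
        "ElectionId": "election ID unknown!",
    }],
    "Election": [{"@id": "some other ID"}],
}

before = dangling_references(skeleton)
after = dangling_references(assign_local_ids(skeleton))
```

Here `before` lists both broken references and `after` is empty once the IDs have been assigned.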
Per the spec, the CVRSnapshot's Type can be either original (As scanned, no contest rules applied), modified (After contest rules applied), or interpreted (Has been adjudicated). Where do I get this information from the EDF?
You might want to review the background information in the NIST-1500-103 specification; it talks about the motivation for the CVR format. The short answer is that there is nothing in the EDF that corresponds with CVRSnapshots or their types; you should always make this type's value original because the value coming from your app is equivalent to what would come out of a scanner.
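Putting the answers so far together (one CVR, one snapshot, Type always original, snapshot ID local to the document, no SequenceNumber), the CVR wrapper could be assembled along these lines. This is a sketch only: `build_cvr` and its default `snapshot_id` are hypothetical names, not anything the spec or the application defines.

```python
def build_cvr(election_id, cvr_contests, snapshot_id="snapshot-1"):
    """Wrap a list of CVRContest objects in a single-snapshot CVR.

    `snapshot_id` only needs to be unique within the document.
    """
    snapshot = {
        "@id": snapshot_id,
        "CVRContest": cvr_contests,
        "Type": "original",  # equivalent to what a scanner would emit
        "@type": "CVR.CVRSnapshot",
    }
    return {
        "CurrentSnapshotId": snapshot_id,  # points at the only snapshot
        "CVRSnapshot": [snapshot],
        "ElectionId": election_id,
        # SequenceNumber is optional and omitted for a single-CVR report.
        "@type": "CVR.CVR",
    }


cvr = build_cvr(
    "election-1",
    [{"ContestId": "contest-orbit-city-mayor",
      "CVRContestSelection": [],
      "@type": "CVR.CVRContest"}],
)
```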
All Codes in the example have a Type of local-level. Per the spec, there are several enumerated values for this. In a given election, would all Types be the same like in the example? Where do we get this information from in the EDF?
I don't think we're making use of Codes, at least not at this point. These seem to be something added to the specification to make it conformant with other formats.
The example's Contest has a Type of CVR.CandidateContest. The spec allows for both PartyContest and BallotMeasureContest. I assume we do not need to know about party contests for now. Given that we have an EDF which contains ballot measures, can we get an example CVR which contains a ballot measure contest?
All four CVRRs in the test cases directory include a ballot-measure contest.
Thanks a ton, Cliff. With your help, I've solved all my questions about CVRs (quite a feat!) except for the one about reporting devices.
For simplicity's sake, I'll go ahead and close this (very large) issue out, and we can sync up about reporting devices later on.
Thanks again!
Initial questions below.
As I continued my research, I answered a lot of these and/or made reasonable assumptions about what we should do. For posterity, I kept the questions and my answers below.
At this time I'm able to generate a usable CVR; however, that CVR has gaps. Answers to the unanswered questions below would help me fill those gaps.
Unanswered
Will the top-level ReportGeneratingDeviceIds always have exactly one element, and will that element correspond to the election ID?
Answered / made reasonable assumptions
~How do I know how many CVRs to generate in the top-level CVR array?~ Answer: one single CVR for everything!
~The ReportingDevice object has both a SerialNumber and an @id. Where do we get these from the EDF?~ Answer: SerialNumber is optional! We can (and should) use the Application instead, and should set it to Trust the Vote application per Cliff's example.
~CVRs have BallotImage as an optional top-level property, defined as "An image of the ballot sheet created by the scanning device." Given that the Markit application does not have images of the ballot, can we simply not provide this? Can we safely ignore anything related to images?~ Answer: yes, we can safely ignore anything related to images
~In the top-level CVR array, will the CVRSnapshot property always only contain one object? The example only has 1, but the spec theoretically accounts for many of them. Can we just always use 1?~ Answer: executive decision: we'll just always use 1
~CVRContest has a Status, which can be undervoted, overvoted, not-indicated, invalidated-rules, or other. The spec enumerates them well. When the user undervotes (which the ballot markup application DOES allow), do we need to put "undervoted", and if so, do we need to set the Undervotes? What do we put here if the voter votes exactly the number of votes?~ Answer: optional, won't use it
~Annotation, which is defined as "used to record annotations made by one or more adjudicators." Given that the Markit application does not have adjudicators, can we simply never provide this optional value?~ Answer: it's optional, so we just won't provide it!
~CVRContest's spec defines a WriteIns integer, which is "The total number of write-ins in the contest." However, the example does not provide this value - even when the CVR contains write-ins! Is it optional, even when there are write-ins? Can we just never provide this?~ Answer: if it's optional and we don't need it, we won't provide it!
~For SelectionPositions, the spec defines the MarkMetricValue as "MarkMetricValue specifies the measurement of a mark on a paper ballot. The measurement is assigned by the scanner for measurements of mark density or quality and would be used by the scanner to indicate whether the mark is a valid voter mark representing a vote or is marginal." We won't have scanners, given that we're digital. Per the example, it's optional. Can we simply not provide it?~ Answer: optional, won't provide it since we don't have anything related to images
~ContestSelection's TotalNumberVotes sounds like it would be the sum of the votes in the SelectionPosition array, and the spec seems to agree: "For cumulative or range and other similar voting variations, contains the total number of votes across all indications/marks." Unfortunately, the example does not adhere to this logic. It has two selections (each with 1 vote) but with a TotalNumberVotes of 0. How should Markit compute this field?~ Answer: executive decision - use math
~For SelectionPositions, will the NumberVotes value always be 1? The spec states "The number of votes represented by the position, usually 1 but may be more depending on the voting method." What voting methods would have more than one? Can we assume "always 1" for now?~ Answer: just going to assume 1 for now
~For SelectionPositions, the spec defines the Position as "The ordinal position of the selection position within the contest option." What does this mean? Per the example, it's optional and appears to only show up when MarkMetricValue shows up. Can we just not provide either of them?~ Answer: optional, won't provide it since we don't have anything related to images
~For SelectionPositions, the spec defines the Status as "Status of the position, e.g., “generated-rules” for generated by the machine, from the PositionStatus enumeration. If no values apply, use ‘other’ and include a user-defined status in OtherStatus." What does this mean? It appears this is optional. Where would we get this info from the EDF? Can we simply not provide it?~ Answer: optional, won't provide it since we don't have anything related to images
~WriteInImage object, which allows for a base64 encoded image and a corresponding hash. We won't have one in the digital ballot entry method. It seems the spec allows for 0..1 WriteInImages in the CVRWriteIn class; however, the example always provides a WriteInImage whenever there's a write-in. Is it okay to not provide the WriteInImage, only providing the Text corresponding to the user's write-in entry?~ Answer: no images! Just don't provide it as long as it's optional.
~Candidate has a Code array. Will this Code array always have exactly 1 element as the example suggests? The spec allows for multiple (or none!). What would happen in those cases?~ Answer: keep it simple. I can generate one just fine, so that's what I'm going to do
~Candidate's Code array's single element has a Value which is equivalent to the Candidate's ID. Will this always be the case? How do we get the Type?~ Answer: redundant question. Keep it simple
~Contest's Code's single element object (will it always have exactly one?) has a Value which corresponds to the Contest's @id BUT without the prefixed underscore. Why is it missing the underscore? What is the significance of the underscore? Do we need to inject it in some places, but not in others? It's worth mentioning that the Candidates' Code (again, exactly 1 element in the Code array) has a Value which corresponds to the Candidate's @id and RETAINS the prefix underscore! Why does the Candidates' Code's Value keep the underscore, but the Contest's Code's Value removes it?~ Answer: no idea what's going on here. I'm just going to generate a single element whose Value has the contest's ID. I'm not going to worry about conditionally prefixing things with underscores unless told to do so
~The Election's Code array has exactly 1 element. Will this always be the case? Additionally, the Value is _${electionID} - will that always be the case? What's the point of the underscore? Is _EL7 the election ID or is it EL7?~ Answer: I really hope we can just not care about this conditional underscore prefix thing
~In the SelectionPosition class, there is an attribute called IsAllocable with the description "Whether this indication should be allocated to the contest option’s accumulator." How do we know if this should be true or not?~ Answer: I'm going to assume this will always be true unless told otherwise
~In the top-level CVR array, what is the SequenceNumber used for? It's not listed in the spec. The value does not appear to be sequential or even unique. Where do we get this from the EDF?~ Answer: not applicable!
~Per the spec, the CVRSnapshot's Type can be either original (As scanned, no contest rules applied), modified (After contest rules applied), or interpreted (Has been adjudicated). Where do I get this information from the EDF?~ Answer: it'll always be original
~In the top-level CVR array, the CurrentSnapshotId corresponds to the single CVRSnapshot element's @id value. Where does this value come from in the EDF?~ Answer: it's local to the document, so it can be whatever we want
~All Codes in the example have a Type of local-level. Per the spec, there are several enumerated values for this. In a given election, would all Types be the same like in the example? Where do we get this information from in the EDF?~ Answer: this isn't relevant to us, so I'll mark them all as N/A
~The example's Contest has a Type of CVR.CandidateContest. The spec allows for both PartyContest and BallotMeasureContest. I assume we do not need to know about party contests for now. Given that we have an EDF which contains ballot measures, can we get an example CVR which contains a ballot measure contest?~ Answer: Cliff gave us a lot!
CVR Spec: https://pages.nist.gov/CastVoteRecords
Example CVR in JSON format: https://github.com/HiltonRoscoe/CDFPrototype/blob/master/CVR/json/example_2.json