TrustTheVote-Project / NIST-1500-100-103-examples

Questions from looking at the example CVR #40

Closed: mattbeattie closed this issue 2 years ago

mattbeattie commented 2 years ago

Initial questions below.

As I continued my research, I answered a lot of these and/or made reasonable assumptions about what we should do. For posterity, I kept the questions and my answers below.

At this time I'm able to generate a usable CVR; however, that CVR has gaps. Answers to the unanswered questions below would help me fill those gaps.

Unanswered

  1. Will the top-level ReportGeneratingDeviceIds always have exactly one element, and will that element correspond to the election ID?

Answered / made reasonable assumptions

CVR Spec: https://pages.nist.gov/CastVoteRecords

Example CVR in JSON format: https://github.com/HiltonRoscoe/CDFPrototype/blob/master/CVR/json/example_2.json

mattbeattie commented 2 years ago

~Below is an "empty" example CVR with no actual CVRs. It just attempts to construct the basics of the CVR, given example_1 as the input.~

~As you can see, a lot of things are unknown because I didn't know where to get the data from the EDF. I'm also not convinced that I did the Contest array correctly - especially when it comes to write-ins and ballot measures.~

~I'm going to put a pin in this for now. I can't really move forward with the CVR array generation until I know how many CVRs to generate (question 2 above), and there are a number of things that need to be clarified before I can fill in the missing gaps below.~

edit: removed the example since I'll have a better one below

Thanks to Cliff, I made some more progress. Most recent status in a comment below this one.

cwulfman commented 2 years ago

That's quite a lot of questions, @mattbeattie. I'll answer them as quickly as I can today.

cwulfman commented 2 years ago

I'm working on updating the set of CVRs that correspond to the Jetsons EDFs. It's still a WIP, but you can look at the jetsons branch.

cwulfman commented 2 years ago

Looking at question 2 above:

How do I know how many CVRs to generate in the top-level CVR array? The spec says "one per cast vote record in the report" and also "Each sheet of a multi-page paper ballot is represented by an individual CVR." How do I know how many CVRs I need to generate? Given that Markit's Ballot Markup application will show one contest at a time, does that mean we'll have one CVR per contest?

Remember the primary purpose of the CVR (section 2.1 of the spec) is to provide a common output format for ballot-scanning devices to record voter choices. For a device that is scanning paper ballots, it may make sense to produce one record per page of a multi-page ballot, but there is no reason why MarkIt shouldn't produce a single CVR containing all the voter's choices.

cwulfman commented 2 years ago

For question 9:

CVRs have BallotImage as an optional top-level property, defined as "An image of the ballot sheet created by the scanning device." Given that the Markit application does not have images of the ballot, can we simply not provide this? Can we safely ignore anything related to images?

I'd say yes: we can ignore anything related to images of paper ballots, because Markit is not a scanner. Remember that these NIST-1500 formats were designed to cover a number of use cases/scenarios, and that not all elements/attributes apply to all cases.

mattbeattie commented 2 years ago

@cwulfman thank you!

Let me take another pass at this. I want to generate the full CVR with the user's choices and everything. I'll keep things as simple as possible while still following the spec. I'll continue to use placeholders for "unknown" values, and we can address as needed.

I'll update this issue again once I've got that working.

mattbeattie commented 2 years ago

Two big updates:

1. I answered / made reasonable assumptions for the majority of my questions

I've updated my top-level comment accordingly.

There are now 7 real questions for which I'll need answers. Most ask "where do I get X from the EDF?" Until I have answers, the gaps will be evident in the "unknown" values - which is totally okay for now, I think.

2. I can now generate a CVR!

The example below is an "empty" CVR, in that it does not have any markup applied to it. That said, it is functionally complete and contains all required values.

The next step is to "merge" the voter's selections (contained in the ballot markup model) into the generated empty CVR. More to come.

The "empty" CVR example:
{
  "CVR": [
    {
      "CurrentSnapshotId": "current snapshot ID unknown!",
      "CVRSnapshot": [
        {
          "@id": "snapshot ID unknown!",
          "CVRContest": [
            {
              "ContestId": "contest-orbit-city-mayor",
              "CVRContestSelection": [
                {
                  "ContestSelectionId": "selection_spacely",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_cogswell",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_writein_1",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                }
              ],
              "@type": "CVR.CVRContest"
            },
            {
              "ContestId": "contest-spaceport-control-board",
              "CVRContestSelection": [
                {
                  "ContestSelectionId": "selection_jetson",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_ellis",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_indexer",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_writein_2",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "selection_writein_3",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                }
              ],
              "@type": "CVR.CVRContest"
            },
            {
              "ContestId": "ballotmeasure-1",
              "CVRContestSelection": [
                {
                  "ContestSelectionId": "ballotmeasure-1_yes",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                },
                {
                  "ContestSelectionId": "ballotmeasure-1_no",
                  "SelectionPosition": [
                    {
                      "HasIndication": "no",
                      "IsAllocable": "yes",
                      "NumberVotes": 0,
                      "@type": "CVR.SelectionPosition"
                    }
                  ],
                  "TotalNumberVotes": 0,
                  "@type": "CVR.CVRContestSelection"
                }
              ],
              "@type": "CVR.CVRContest"
            }
          ],
          "Type": "original",
          "@type": "CVR.CVRSnapshot"
        }
      ],
      "ElectionId": "election ID unknown!",
      "SequenceNumber": "sequence number unknown!",
      "@type": "CVR.CVR"
    }
  ],
  "Election": [
    {
      "@id": "election ID unknown!",
      "Candidate": [
        {
          "@id": "candidate_spacely",
          "Code": [
            {
              "Type": "candidate code type unknown!",
              "Value": "candidate_spacely",
              "@type": "CVR.Code"
            }
          ],
          "@type": "CVR.Candidate"
        },
        {
          "@id": "candidate_cogswell",
          "Code": [
            {
              "Type": "candidate code type unknown!",
              "Value": "candidate_cogswell",
              "@type": "CVR.Code"
            }
          ],
          "@type": "CVR.Candidate"
        },
        {
          "@id": "candidate_jetson",
          "Code": [
            {
              "Type": "candidate code type unknown!",
              "Value": "candidate_jetson",
              "@type": "CVR.Code"
            }
          ],
          "@type": "CVR.Candidate"
        },
        {
          "@id": "candidate_ellis",
          "Code": [
            {
              "Type": "candidate code type unknown!",
              "Value": "candidate_ellis",
              "@type": "CVR.Code"
            }
          ],
          "@type": "CVR.Candidate"
        },
        {
          "@id": "candidate_indexer",
          "Code": [
            {
              "Type": "candidate code type unknown!",
              "Value": "candidate_indexer",
              "@type": "CVR.Code"
            }
          ],
          "@type": "CVR.Candidate"
        }
      ],
      "Code": [
        {
          "Type": "election code type unknown!",
          "Value": "election ID unknown!",
          "@type": "CVR.Code"
        }
      ],
      "Contest": [
        {
          "@id": "contest-orbit-city-mayor",
          "Code": [
            {
              "Type": "contest code type unknown!",
              "Value": "contest-orbit-city-mayor",
              "@type": "CVR.Code"
            }
          ],
          "ContestSelection": [
            {
              "@id": "selection_spacely",
              "CandidateIds": ["candidate_spacely"],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_cogswell",
              "CandidateIds": ["candidate_cogswell"],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_writein_1",
              "CandidateIds": [],
              "@type": "CVR.CandidateSelection"
            }
          ],
          "@type": "CVR.CandidateContest"
        },
        {
          "@id": "contest-spaceport-control-board",
          "Code": [
            {
              "Type": "contest code type unknown!",
              "Value": "contest-spaceport-control-board",
              "@type": "CVR.Code"
            }
          ],
          "ContestSelection": [
            {
              "@id": "selection_jetson",
              "CandidateIds": ["candidate_jetson"],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_ellis",
              "CandidateIds": ["candidate_ellis"],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_indexer",
              "CandidateIds": ["candidate_indexer"],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_writein_2",
              "CandidateIds": [],
              "@type": "CVR.CandidateSelection"
            },
            {
              "@id": "selection_writein_3",
              "CandidateIds": [],
              "@type": "CVR.CandidateSelection"
            }
          ],
          "@type": "CVR.CandidateContest"
        },
        {
          "@id": "ballotmeasure-1",
          "Code": [
            {
              "Type": "contest code type unknown!",
              "Value": "ballotmeasure-1",
              "@type": "CVR.Code"
            }
          ],
          "ContestSelection": [],
          "@type": "CVR.CandidateContest"
        }
      ],
      "@type": "CVR.Election"
    }
  ],
  "GeneratedDate": "2062-01-01T12:00:00-08:00",
  "GpUnit": [
    {
      "@id": "state_farallon",
      "Type": "state",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "county_gadget",
      "Type": "county",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "district_orbit_city",
      "Type": "city",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "district_aldrin_spaceport",
      "Type": "city",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "precinct_1_downtown",
      "Type": "precinct",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "precinct_2_spacetown",
      "Type": "precinct",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "precinct_3_spaceport",
      "Type": "precinct",
      "@type": "CVR.GpUnit"
    },
    {
      "@id": "precinct_4_bedrock",
      "Type": "precinct",
      "@type": "CVR.GpUnit"
    }
  ],
  "ReportGeneratingDeviceIds": ["election ID unknown!"],
  "ReportingDevice": [
    {
      "@id": "reporting device id unknown!",
      "Application": "Trust the Vote application",
      "@type": "CVR.ReportingDevice"
    }
  ],
  "Version": "1.0.0"
}
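
The "merge" step described above could look roughly like the following. This is a minimal sketch, not the application's actual code: `merge_selections` and the `chosen` mapping (a stand-in for the ballot markup model) are hypothetical names, and the sketch assumes one vote per marked selection. It applies no contest rules (for example, it does not detect overvotes), which matches a snapshot of type original.

```python
import copy


def merge_selections(empty_cvr, chosen):
    """Return a copy of one element of the CVR array with the voter's choices marked.

    `chosen` is a hypothetical stand-in for the ballot markup model: it maps a
    ContestId to the set of ContestSelectionIds the voter picked.
    """
    cvr = copy.deepcopy(empty_cvr)  # leave the empty template untouched
    for snapshot in cvr["CVRSnapshot"]:
        for contest in snapshot["CVRContest"]:
            picked = chosen.get(contest["ContestId"], set())
            for selection in contest["CVRContestSelection"]:
                marked = selection["ContestSelectionId"] in picked
                votes = 1 if marked else 0  # assumption: one vote per mark
                for position in selection["SelectionPosition"]:
                    position["HasIndication"] = "yes" if marked else "no"
                    position["NumberVotes"] = votes
                selection["TotalNumberVotes"] = votes
    return cvr


def _empty_selection(selection_id):
    # Same shape as the unmarked selections in the example above.
    return {
        "ContestSelectionId": selection_id,
        "SelectionPosition": [
            {"HasIndication": "no", "IsAllocable": "yes", "NumberVotes": 0}
        ],
        "TotalNumberVotes": 0,
    }


empty = {
    "CVRSnapshot": [
        {
            "CVRContest": [
                {
                    "ContestId": "contest-orbit-city-mayor",
                    "CVRContestSelection": [
                        _empty_selection("selection_spacely"),
                        _empty_selection("selection_cogswell"),
                    ],
                }
            ]
        }
    ]
}

marked = merge_selections(empty, {"contest-orbit-city-mayor": {"selection_spacely"}})
```

Working on a deep copy keeps the empty CVR reusable as a template for the next ballot.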
cwulfman commented 2 years ago

In the top-level CVR array, what is the SequenceNumber used for? It's not listed in the spec. The value does not appear to be sequential or even unique. Where do we get this from the EDF?

From the spec documentation (p. 67):

Each CVR element also includes an optional sequence number (SequenceNumber); this isn’t required but could be helpful to auditors.

This is, then, for the case in which a CastVoteRecordReport contains many CVRs. Not applicable in our case.
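
Since SequenceNumber is optional (as is BallotImage, discussed earlier in this thread), one option is to leave such properties out rather than emit placeholder strings. A minimal sketch, assuming the CVR is held as a plain dictionary; `without_optional` is a hypothetical helper name:

```python
def without_optional(cvr, names=("SequenceNumber", "BallotImage")):
    """Return a shallow copy of a CVR dict without the named optional properties."""
    return {key: value for key, value in cvr.items() if key not in names}


cvr = {
    "ElectionId": "election ID unknown!",
    "SequenceNumber": "sequence number unknown!",
    "@type": "CVR.CVR",
}
trimmed = without_optional(cvr)
```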

cwulfman commented 2 years ago

Will the top-level ReportGeneratingDeviceIds always have exactly one element, and will that element correspond to the election ID?

I don't think we've ever talked about what a reporting device is in our context.

ReportingDevice is used to specify a voting device as the “political geography” at hand. CastVoteRecordReport refers to it as ReportGeneratingDevice and uses it to specify the device that created the CVR report. CVR refers to it as CreatingDevice to specify the device that created the CVRs.

I defer to @trustthevote to answer this

cwulfman commented 2 years ago

Where do we get the election ID from the EDF?

Remember that in XML, ID/IDREF attributes are local to the document. The CastVoteRecordReport requires at least one Election element, and the Election element requires a value for the ObjectId attribute (the value must be an xsd:ID). The CVRR can contain more than one Election element, so the ElectionId/ObjectId pairing serves to link a CVR to the election to which it applies. Within the context of the CVRR, that ObjectId can be an arbitrary, document-specific identifier, which your application, presumably, would generate.

NIST-1500-100 defines an optional ExternalIdentifier property for its Election element, but it isn't clear how that identifier, if it exists, should be used in the CVRR: perhaps as the value of the Code element.

This is another place where these two schemas, while related, do not dovetail precisely.
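
To illustrate the point about document-local identifiers, here is a minimal sketch of generating an election ID in the application and using it on both sides of the link. The helper names (`new_local_id`, `link_election`) are hypothetical, and the sketch assumes a report with a single Election, held as a dictionary shaped like the JSON example earlier in this thread.

```python
import uuid


def new_local_id(prefix):
    """Generate an identifier that only needs to be unique within one report.

    XML ID values cannot start with a digit, so an alphabetic prefix is kept
    in front of the random part.
    """
    return f"{prefix}-{uuid.uuid4().hex}"


def link_election(report):
    """Assign one generated ID to the report's Election and to every CVR's ElectionId."""
    election_id = new_local_id("election")
    report["Election"][0]["@id"] = election_id  # assumption: exactly one Election
    for cvr in report["CVR"]:
        cvr["ElectionId"] = election_id
    return election_id


report = {
    "CVR": [{"ElectionId": "election ID unknown!", "@type": "CVR.CVR"}],
    "Election": [{"@id": "election ID unknown!", "@type": "CVR.Election"}],
}
election_id = link_election(report)
```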

cwulfman commented 2 years ago

In the top-level CVR array, the CurrentSnapshotId corresponds to the single CVRSnapshot element's @id value. Where does this value come from in the EDF?

It doesn't come from the EDF. As in the ElectionIdentifier discussion above, these are ID/IDRef pairs, local to the scope of the XML document.

This is also an example, by the way, of a fundamental difference between XML and JSON: JSON has no equivalent of ID/IDREF.
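
Because JSON will not enforce these pairings, an application-level check can stand in for them. The following is a minimal sketch under stated assumptions: `dangling_references` is a hypothetical helper, the report is a dictionary shaped like the JSON example earlier in this thread, and selection IDs are checked against all contests in the report rather than per contest.

```python
def dangling_references(report):
    """Return (property, value) pairs for references that match no @id in the report."""
    election_ids = set()
    contest_ids = set()
    selection_ids = set()
    for election in report.get("Election", []):
        election_ids.add(election["@id"])
        for contest in election.get("Contest", []):
            contest_ids.add(contest["@id"])
            for selection in contest.get("ContestSelection", []):
                selection_ids.add(selection["@id"])

    problems = []
    for cvr in report.get("CVR", []):
        snapshots = cvr.get("CVRSnapshot", [])
        snapshot_ids = {snapshot["@id"] for snapshot in snapshots}
        if cvr.get("CurrentSnapshotId") not in snapshot_ids:
            problems.append(("CurrentSnapshotId", cvr.get("CurrentSnapshotId")))
        if cvr.get("ElectionId") not in election_ids:
            problems.append(("ElectionId", cvr.get("ElectionId")))
        for snapshot in snapshots:
            for contest in snapshot.get("CVRContest", []):
                if contest["ContestId"] not in contest_ids:
                    problems.append(("ContestId", contest["ContestId"]))
                for selection in contest.get("CVRContestSelection", []):
                    if selection["ContestSelectionId"] not in selection_ids:
                        problems.append(
                            ("ContestSelectionId", selection["ContestSelectionId"])
                        )
    return problems
```

Run against the "empty" CVR posted earlier in this thread, a check like this should flag the CurrentSnapshotId (its placeholder text differs from the snapshot's @id placeholder) and the two ballot-measure selection IDs, which the Election section never defines.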

cwulfman commented 2 years ago

Per the spec, the CVRSnapshot's Type can be either original (As scanned, no contest rules applied), modified (After contest rules applied), or interpreted (Has been adjudicated). Where do I get this information from the EDF?

You might want to review the background information in the NIST-1500-103 specification; it talks about the motivation for the CVR format. The short answer is that there is nothing in the EDF that corresponds with CVRSnapshots or their types; you should always make this type's value original because the value coming from your app is equivalent to what would come out of a scanner.
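
In practice this means the snapshot type can be hard-coded when the application builds a CVR. A minimal sketch, assuming dictionaries shaped like the JSON example earlier in this thread; `wrap_in_snapshot` is a hypothetical helper, the snapshot ID is generated locally as discussed above, and the caller is assumed to add ElectionId afterwards.

```python
import uuid


def wrap_in_snapshot(cvr_contests):
    """Wrap a list of CVRContest dicts in a CVR with a single snapshot of type original."""
    snapshot_id = f"snapshot-{uuid.uuid4().hex}"  # document-local identifier
    return {
        "CurrentSnapshotId": snapshot_id,  # points at the only snapshot
        "CVRSnapshot": [
            {
                "@id": snapshot_id,
                "CVRContest": cvr_contests,
                "Type": "original",  # never modified or interpreted for this app
                "@type": "CVR.CVRSnapshot",
            }
        ],
        "@type": "CVR.CVR",
    }


cvr = wrap_in_snapshot([])
```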

cwulfman commented 2 years ago

All Codes in the example have a Type of local-level. Per the spec, there are several enumerated values for this. In a given election, would all Types be the same like in the example? Where do we get this information from in the EDF?

I don't think we're making use of Codes, at least not at this point. These seem to be something added to the specification to make it conformant with other formats.

cwulfman commented 2 years ago

The example's Contest has a Type of CVR.CandidateContest. The spec allows for both PartyContest and BallotMeasureContest. I assume we do not need to know about party contests for now. Given that we have an EDF which contains ballot measures, can we get an example CVR which contains a ballot measure contest?

All four CVRRs in the test cases directory include a ballot-measure contest.
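
For comparison with the candidate contests in the example earlier in this thread, a ballot-measure contest in the Election section might be built along these lines. This is a sketch under assumptions, not taken from the test cases: the type names `CVR.BallotMeasureContest` and `CVR.BallotMeasureSelection` and the `Selection` property reflect my reading of NIST-1500-103 and should be checked against the schema and the CVRRs in the test cases directory. The `ballot_measure_contest` helper is hypothetical, and the ID pattern follows the `ballotmeasure-1_yes` / `ballotmeasure-1_no` IDs used in the example's CVRContest.

```python
def ballot_measure_contest(contest_id, options=("yes", "no")):
    """Build an Election-level contest entry for a ballot measure (assumed shape)."""
    return {
        "@id": contest_id,
        "ContestSelection": [
            {
                "@id": f"{contest_id}_{option}",
                "Selection": option,  # assumed property name, verify against the schema
                "@type": "CVR.BallotMeasureSelection",
            }
            for option in options
        ],
        "@type": "CVR.BallotMeasureContest",
    }


contest = ballot_measure_contest("ballotmeasure-1")
```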

mattbeattie commented 2 years ago

Thanks a ton, Cliff. With your help, I've solved all my questions about CVRs (quite a feat!) except for the one about reporting devices.

For simplicity's sake, I'll go ahead and close this (very large) issue out, and we can sync up about reporting devices later on.

Thanks again!