usnistgov / ElectionResultsReporting

Common data format specification for election results reporting data
https://pages.nist.gov/ElectionResultsReporting

No way to identify the jurisdiction for an office #3

Open carl3 opened 7 years ago

carl3 commented 7 years ago

Organization Name: Carl Hage

Organization Type: 4

Document: SP1500-100 ElectionResultsReporting

Reference: 4.2.16 the Office Element

Comment: The <ElectoralDistrictId> for an office identifies the geographic area of eligible voters, which may have nothing to do with the jurisdiction (Organization, Public Agency, District). Some jurisdictions use voting districts defined by another jurisdiction. For example, the "Los Angeles County Democratic Party County Central Committee" might use state assembly districts or county supervisorial districts. A "County Office of Education" represents a set of school districts, some of which may cross into another county. All voters in the primary county (county boundary) might vote for that office, as well as voters in an adjacent county in the school districts represented.

There needs to be a reference to a GpUnit representing the organization (Jurisdiction or Public Agency) at large, independent of the election district specific to a particular seat for an office.

The Name for an office is often abbreviated, may not be unique, and often spelled in different ways on different reports and files. It cannot easily be used to identify the name of the school district, etc. without complex manually-configured locality-specific pattern matching.

Likewise, an office name is typically written in so many ways that a machine cannot identify the title of the elected office and the area/seat name.

Suggested Change: Add <JurisdictionDistrictId> to reference the GpUnit that represents the organization that is the public agency or jurisdiction for the elected office. If omitted, it is assumed to be the same as <ElectoralDistrictId>.

Add <OfficeName> to represent the title of the elected office, separate from the jurisdiction name, district area (partition of the jurisdiction), or a seat name, e.g. "Council Member".

The <Name> in the referenced <JurisdictionDistrictId> GpUnit must be the jurisdiction name.

Add <AreaName> to represent the area of a partitioned jurisdiction (Council District, Trustee Area, etc.). For offices elected at-large, the <ElectoralDistrictId> is the whole jurisdiction, but the office might represent Council District 3. For offices elected by-district, the <ElectoralDistrictId> name would be the combination of the <JurisdictionDistrictId> name and <AreaName>.

Add <SeatName> to represent an identification of a seat for a group office (council, court, etc.) not related to a geographic area. Sometimes a term start date is used to identify a seat (e.g. Ohio Judges). In some areas, the seat is identified by the name of the incumbent.

The separate OfficeName, JurisdictionDistrictId.Name, AreaName, and SeatName form parsed machine-readable names that can identify the office.

ContactInformation can be associated with a particular office, and also independently for the jurisdiction to which it belongs in <JurisdictionDistrictId>.
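
As a minimal sketch of the suggested change, the proposed fields could be modeled as below. This is illustrative only: the class `Office`, its attribute names, and the GpUnit ids (`gp-sunnyvale`, `gp-county`, `gp-coe`) are hypothetical and are not part of SP 1500-100. The one rule it encodes is the default stated above: an omitted <JurisdictionDistrictId> is assumed to equal <ElectoralDistrictId>.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Office:
    """Hypothetical model of the proposed Office fields (not the published schema)."""

    electoral_district_id: str                      # existing <ElectoralDistrictId>
    office_name: str                                # proposed <OfficeName>, e.g. "Council Member"
    jurisdiction_district_id: Optional[str] = None  # proposed <JurisdictionDistrictId>
    area_name: Optional[str] = None                 # proposed <AreaName>
    seat_name: Optional[str] = None                 # proposed <SeatName>

    def effective_jurisdiction_id(self) -> str:
        # If <JurisdictionDistrictId> is omitted, fall back to <ElectoralDistrictId>.
        if self.jurisdiction_district_id is not None:
            return self.jurisdiction_district_id
        return self.electoral_district_id


# At-large council office: voters and jurisdiction are the same GpUnit.
at_large = Office(
    electoral_district_id="gp-sunnyvale",
    office_name="Council Member",
    area_name="Council District 3",
)

# Office whose voters come from a different GpUnit than its jurisdiction.
cross_county = Office(
    electoral_district_id="gp-county",
    office_name="Board Member",
    jurisdiction_district_id="gp-coe",
)

print(at_large.effective_jurisdiction_id())      # gp-sunnyvale
print(cross_county.effective_jurisdiction_id())  # gp-coe
```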

jdmgoogle commented 7 years ago

I think there are some good suggestions buried in here, but there are several issues which are conflated in this report that I'd like to see separated out for consideration.

The higher-level point of the electoral district versus the jurisdiction is a good one (the obvious example case is a congressional district versus the jurisdiction of Congress as a whole). In fact this is something we started discussing over at VIP: votinginfoproject/vip-specification#350

We also discussed creating an OfficeBody element (votinginfoproject/vip-specification#42) which is, I believe, something which this proposal attempts to address. Ultimately we decided not to go down that path, but that may be something to revisit/spin off into a separate GitHub issue.

I'm not entirely sure what the AreaName proposal is about, so I can't provide any feedback on that.

I endorse the idea of an (optional) SeatName element, since that's something we ran into in 2016 (e.g., multiple at-large seats up for election in a single cycle because an officeholder passed away/moved/resigned ahead of their term expiration).

carl3 commented 7 years ago

To elaborate, the issue is converting a contest/candidate list into something machine readable. Typically, contests have an abbreviated free-form title (it's nice to see the <Contest> has abbreviated name, name, and ballot title/subtitle). But to identify that contest, the abbreviated name needs to be unabbreviated and parsed into separate elements that identify the contest:

  1. The name of the jurisdiction, e.g. "City of Sunnyvale".
  2. The title of the office, e.g. "Council Member".
  3. The "district" for a geographically partitioned jurisdiction, which is what I meant by AreaName (please suggest a better name), e.g. "Council District B".
  4. A seat name that identifies a position other than geography, e.g. "Seat 1", "Office 10".

The office name, i.e. one that can be used to uniquely distinguish an office is

The above applies to an office. A contest might also add term info, e.g. "Unexpired Short Term" or "Full Term".
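
One way to picture the four parts (plus optional term info for a contest) is as a structured key kept separate from any display string. The sketch below is an assumption for illustration: `OfficeKey`, `describe`, and the comma-joined output format are hypothetical and not taken from the specification or from any EMS.

```python
from typing import NamedTuple, Optional


class OfficeKey(NamedTuple):
    """Hypothetical structured identity for an office, following the four parts above."""

    jurisdiction_name: str           # 1. e.g. "City of Sunnyvale"
    office_title: str                # 2. e.g. "Council Member"
    area_name: Optional[str] = None  # 3. e.g. "Council District B"
    seat_name: Optional[str] = None  # 4. e.g. "Seat 1"


def describe(key: OfficeKey, term: Optional[str] = None) -> str:
    """Join the parts that are present into a display string; a contest may add term info."""
    parts = [key.jurisdiction_name, key.office_title, key.area_name, key.seat_name, term]
    return ", ".join(p for p in parts if p)


seat = OfficeKey("City of Sunnyvale", "Council Member", seat_name="Seat 1")
print(describe(seat))                           # City of Sunnyvale, Council Member, Seat 1
print(describe(seat, term="Unexpired Short Term"))
```

Because the key is a tuple of separate fields, two records can be compared part by part without parsing an abbreviated title.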

When an office with a district (area portion) is elected at large, the area could be represented as a "Seat", however, I think it is better to identify the area by name (to allow the GpUnit for the district to be identified for candidate filing). There are cases where there is a primary elected by-district and runoff elected at-large.

jdmgoogle commented 7 years ago

Parsing names in order to get structured information is a Sisyphean task. As someone who implemented some of the logic within Google to handle magic values in fields or expecting text strings to have certain formats, all I can say is: here be dragons. Don't go there.

carl3 commented 7 years ago

I've done my share of Sisyphean parsing of abbreviated contest titles. That's the point-- there should be a way to (optionally) represent the parsed composition of a contest as it is originally stored in an EMS. By redacting it and removing the possibility to define the office in a machine readable way, you force the reading app into the (currently usual) complex parsing of abbreviated office titles. I'm not saying a translator should do Sisyphean parsing-- there just needs to be a way to preserve content, or worst case, EA-defined configuration options in the xml writer.

jdmgoogle commented 7 years ago

The point I was trying to make is that the moment you're trying to parse anything that's a Name field (broken-down or not) you've already lost. E.g., here are some ways that "City of Sunnyvale" could be represented:

  1. City of Sunnyvale
  2. Sunnyvale
  3. SUNNYVALE
  4. Sunnyvale, CA
  5. Sunnyvale (CA)

There are similarly endless ways which the other names could be represented.

Parsing free-form strings into enums/unique entities: Not Even Once.
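
A small sketch of why name matching is fragile, using the five spellings listed above. The `normalize` helper is a hypothetical, deliberately naive normalizer (case-folding and trimming only); it is not a recommended approach, just a demonstration that simple cleanup does not collapse the variants into one entity.

```python
variants = [
    "City of Sunnyvale",
    "Sunnyvale",
    "SUNNYVALE",
    "Sunnyvale, CA",
    "Sunnyvale (CA)",
]


def normalize(name: str) -> str:
    """Naive normalization: case-fold and trim whitespace."""
    return name.casefold().strip()


distinct = {normalize(v) for v in variants}

# Only "Sunnyvale" and "SUNNYVALE" collapse; the rest remain distinct strings.
print(len(variants), "spellings ->", len(distinct), "distinct names after normalization")
```

Each additional rule (dropping "City of", stripping state suffixes, and so on) handles one more variant but is locality-specific, which is the maintenance burden described in this thread.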

johnpwack commented 7 years ago

I am in favor of doing the following:

I'm not sure of adding an area name either - it's not something that would have votes associated with it. So, I'm wondering, what is the value in having it?

JDziurlaj commented 4 years ago

This tracks with some suggested changes from Google. We should revisit this when there is time.