opencivicdata / docs.opencivicdata.org

Open Civic Data project documentation
https://open-civic-data.readthedocs.io

document differences between our Vote and Popolo Vote #15

Closed jamesturk closed 10 years ago

paultag commented 10 years ago

Stuff I noticed going through the two implementations:

Vote (the vote.votes object)
============================

    New stuff (from Popolo), not yet in OCD Models
      + political group
      + role (tellers, etc)
      + weight (party votes, some people have 1.1x votes, etc)
      + vote pairings

    Questions (from paultag)
      + Vote.group_id => Political party. We can get the Person's party
        via person.orgs w/classification:party, right now. Should be defined
        as the voting bloc's ID, a person can be a member of more than one
        party.
      + Is there an established grammar on role? It'd be annoying to get
        freeform text back - teller vs Teller vs TELLER
      + weights can be non-integers (worked this one out with granicus, they
        have some places where the chair gets a weight of 1.1, or a non-council
        official having half a vote). I think we can safely assume that they're
        positive, though.

VoteEvent (the vote object)
===========================

    OCD Missing
      + Top-level Motions[see Motions section]

    Stuff shoved into Motion from the top level (as we have it):
      + Related bill
      + Related org
      + Outcome (called "result" in Popolo)
      + Session (context in Popolo)

Motions
=======

This was split into its own top level object.

 1: What is object_id ? There's no type of related entity there; looks like
    it's pointing to a bill, but it's not bill_id. Can it point at nonbill?
    If so, why not bill_id? If not, why is there no type?

 2: is there a strict grammar on requirement? Seems like we'd want a common
    way to talk about the voting requirements for automatic passage
    detection / automatically calculating minimum votes.

 3: No classification (we store in top-level VoteEvent now).
    Examples that we had in Open States:
      - passage
      - amendment
      - reading:1
      - reading:2
      - other
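
To make the "automatic passage detection" idea in point 2 concrete, here is a minimal Python sketch. The requirement codes (`simple-majority`, `two-thirds`) and the `passes` helper are hypothetical, since no code list has been agreed on, and the sketch assumes integer yes/no counts rather than weighted votes.

```python
from fractions import Fraction

# Hypothetical code list; nothing here has been agreed on in this thread.
THRESHOLDS = {
    "simple-majority": Fraction(1, 2),
    "two-thirds": Fraction(2, 3),
}

def passes(requirement, yes, no):
    """Return True if `yes` meets `requirement` among the yes/no votes cast."""
    threshold = THRESHOLDS[requirement]
    cast = yes + no
    if cast == 0:
        return False
    share = Fraction(yes, cast)
    # Assumption: a simple majority must strictly exceed one half,
    # while a supermajority only has to reach its threshold.
    if requirement == "simple-majority":
        return share > threshold
    return share >= threshold

print(passes("two-thirds", 4, 2))
```

A free-form requirement string could not be looked up this way, which is the practical argument for a shared grammar.
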
jamesturk commented 10 years ago

Vote

VoteEvent

Motion

jamesturk commented 10 years ago

(tagging @jpmckinney) this is where we're discussing our differences, I think the big thing for us is the non-use of Motion w/ those fields squashed onto VoteEvent https://github.com/opencivicdata/python-opencivicdata/blob/master/opencivicdata/models/vote.py

jpmckinney commented 10 years ago

Vote.group_id => Political party. We can get the Person's party via person.orgs w/classification:party, right now. Should be defined as the voting bloc's ID, a person can be a member of more than one party.

Are you suggesting a change to the definition of group, currently "The voter's primary political group"? I agree there can be more clarity.

Is there an established grammar on role? It'd be annoying to get freeform text back - teller vs Teller vs TELLER

It is currently free-form, but we can add an open-ended code list, once we agree on possible values.
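
As a rough illustration of how a consumer might treat an open-ended code list, here is a minimal Python sketch. The `KNOWN_ROLES` set and the `normalize_role` helper are hypothetical; "teller" is the only value mentioned in this thread.

```python
# Hypothetical code list; "teller" is the only role named in the thread.
KNOWN_ROLES = {"teller"}

def normalize_role(role):
    """Return (normalized role, whether it is on the code list).

    The list is open-ended, so unknown roles are kept rather than rejected.
    """
    cleaned = role.strip().lower()
    return cleaned, cleaned in KNOWN_ROLES

for raw in ("teller", "Teller", "TELLER", "Proxy"):
    print(raw, normalize_role(raw))
```

Normalizing case on the way in collapses the teller/Teller/TELLER variants while still letting implementations report roles that have not been listed yet.
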

weights can be non-integers (worked this one out with granicus, they have some places where the chair gets a weight of 1.1, or a non-council official having half a vote). I think we can safely assume that they're positive, though.

Noted, the JSON Schema now uses number. Thanks for finding this!
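
For what non-integer weights mean when counting, a minimal Python sketch follows. The `tally` helper is hypothetical, and treating a missing `weight` as 1 is an assumption made for the example, not something stated in this thread.

```python
from collections import defaultdict

def tally(votes):
    """Sum vote weights per option.

    Assumption: a vote without an explicit weight counts as 1.
    """
    totals = defaultdict(float)
    for vote in votes:
        weight = vote.get("weight", 1)
        if weight <= 0:
            # Weights may be fractional, but are assumed to be positive.
            raise ValueError(f"weight must be positive, got {weight!r}")
        totals[vote["option"]] += weight
    return dict(totals)

votes = [
    {"option": "yes", "weight": 1.1},  # e.g. a chair with a 1.1x vote
    {"option": "yes"},
    {"option": "no", "weight": 0.5},   # e.g. an official with half a vote
]
print(tally(votes))
```
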

Stuff shoved into Motion from the top level (as we have it):

  • Related org
  • Outcome (called "result" in Popolo)
  • Session (context in Popolo)

I've added VoteEvent#result, which is needed when a motion has multiple events. I added organization and context, because it's reasonable to make statements about a vote event's organization and session. Can you use the terms result, organization and context?

  • Related bill

Strictly speaking, vote events are not related to bills. The motion being voted on is related to the bill. Could you have a JSON representation of a vote event like:

{
  ...
  "motion": {
    "text": "That Bill C-23 be read a second time.",
    "object_id": "a-unique-identifier-for-the-related-bill"
  },
  ...
}

1: What is object_id ? There's no type of related entity there; looks like it's pointing to a bill, but it's not bill_id. Can it point at nonbill? If so, why not bill_id? If not, why is there no type?

It's open-ended, but a motion can be about a bill (second reading), about another motion (it happens), about an amendment to a bill (if an implementation specifically models amendments), etc. Popolo doesn't model those, so there's no specific range given for the property. What would make the documentation clearer on this point?

2: is there a strict grammar on requirement? Seems like we'd want a common way to talk about the voting requirements for automatic passage detection / automatically calculating minimum votes.

This is certainly something we can work on. Do you have a list to start things off with?

3: No classification (we store in top-level VoteEvent now).

Strictly speaking, the examples given are classifications of motions (motions of passage, etc.), not of vote events. Would it be possible to put it on a motion subdocument as above? I've added this property to the Motion class.

Lastly, you currently have votes which themselves have votes (vote.votes). This is a little confusing, which is why Popolo has "vote events" and "votes". Would it be possible to use "vote events"?

paultag commented 10 years ago

Thanks for the quick reply, James!

Vote.group_id => Political party. We can get the Person's party via person.orgs w/classification:party, right now. Should be defined as the voting bloc's ID, a person can be a member of more than one party.

Are you suggesting a change to the definition of group, currently "The voter's primary political group"? I agree there can be more clarity.

I'm going to delay my response on this; there's a bit of research I have to do quickly to pull in some facts, rather than work off my memory of the situation here.

Is there an established grammar on role? It'd be annoying to get freeform text back - teller vs Teller vs TELLER

It is currently free-form, but we can add an open-ended code list, once we agree on possible values.

Cool. That sounds good.

weights can be non-integers (worked this one out with granicus, they have some places where the chair gets a weight of 1.1, or a non-council official having half a vote). I think we can safely assume that they're positive, though.

Noted, the JSON Schema now uses number. Thanks for finding this!

Sure :)

Stuff shoved into Motion from the top level (as we have it):

  • Related org
  • Outcome (called "result" in Popolo)
  • Session (context in Popolo)

I've added VoteEvent#result, which is needed when a motion has multiple events. I added organization and context, because it's reasonable to make statements about a vote event's organization and session. Can you use the terms result, organization and context?

We currently have bill.session which is related to a session (pulled from jurisdiction.legislative_sessions, and put into a JurisdictionSession in the database) which is pretty similar to context. I don't know about the rest of the team, but the term context seems really confusing to me, and I love this data :). I'm not even sure what should be here - it looks like it's an Object, with some stuff that's hardcoded for a parliamentary system.

  • Related bill

Strictly speaking, vote events are not related to bills. The motion being voted on is related to the bill. Could you have a JSON representation of a vote event like:

Yeah, sure.

{
  ...
  "motion": {
    "text": "That Bill C-23 be read a second time.",
    "object_id": "a-unique-identifier-for-the-related-bill"
  },
  ...
}

Right.

1: What is object_id ? There's no type of related entity there; looks like it's pointing to a bill, but it's not bill_id. Can it point at nonbill? If so, why not bill_id? If not, why is there no type?

It's open-ended, but a motion can be about a bill (second reading), about another motion (it happens), about an amendment to a bill (if an implementation specifically models amendments), etc. Popolo doesn't model those, so there's no specific range given for the property. What would make the documentation clearer on this point?

Yeah, that'd be handy.

About the name itself, basically, there are a few pain-points here:

I'd be pretty keen on seeing something like bill_id or vote_event_id in our implementation, since it's a lot more explicit about what you're getting, and it means we can set up a proper relation in the database.

At the least, something like:

{
  ...
  "motion": {
    "text": "That Bill C-23 be read a second time.",
    "related_object_id": "a-unique-identifier-for-the-related-bill",
    "type": "bill"
  },
  ...
}

in the serialization (basically, I care about that type more than the object_id name)

2: is there a strict grammar on requirement? Seems like we'd want a common way to talk about the voting requirements for automatic passage detection / automatically calculating minimum votes.

This is certainly something we can work on. Do you have a list to start things off with?

Not yet, but it caught my eye. Hashing one out would be really useful in the long-term.

3: No classification (we store in top-level VoteEvent now).

Strictly speaking, the examples given are classifications of motions (motions of passage, etc.), not of vote events. Would it be possible to put it on a motion subdocument as above? I've added this property to the Motion class.

Cool.

Lastly, you currently have votes which themselves have votes (vote.votes). This is a little confusing, which is why Popolo has "vote events" and "votes". Would it be possible to use "vote events"?

I mean, I understand why there are two terms here, but I think (despite it being confusing) the term vote is actually used in both ways.

You can both hold a "vote", and cast your "vote". We currently have an Event type, so having a VoteEvent type sounds a bit confusing to me, and would make me think they should be merged (but really express totally different things)

I think we use the term VoteEvent in the DB models, but I do think the term VoteEvent could be really confusing to people. I think most people would understand the difference between vote and vote.votes once they look at the data.

Anyway, I'm totally not opposed to any of this, just want to make sure it's all done for the right reasons :)

jpmckinney commented 10 years ago

We currently have bill.session which is related to a session (pulled from jurisdiction.legislative_sessions, and put into a JurisdictionSession in the database) which is pretty similar to context. I don't know about the rest of the team, but the term context seems really confusing to me, and I love this data :). I'm not even sure what should be here - it looks like it's an Object, with some stuff that's hardcoded for a parliamentary system.

context is free-form. I've now copied the explanation from the Motion docs into the VoteEvent docs: "The range of the context property is not specified, as it varies greatly across jurisdictions." The example is parliamentary, but you can do whatever, as long as it's an object. I chose context because I am not sure that session is international. I also know that some jurisdictions get more precise about the legislative context. Once Popolo implements a generic Event class, context will be an Event, which can be a sitting, a term, a session, etc.

One option here is for me to validate whether session is sufficiently international, and if so to implement it to refer specifically to sessions, and then decide whether to eliminate context or keep it with a more specific meaning.

... object_id ...

object_id was a mistake. member is another polymorphic property, and I omitted member_id for that reason. A motion would look like:

{
  ...
  "object": {
    "@type": "Motion",
    "id": "of-another-motion-that-this-motion-is-amending"
  },
  ...
}

How's that? We have @type because, when the JSON-LD contexts are applied to the JSON, interpreters understand that the object is an instance of the Motion class. If the @ is an issue, we can try to find something that works for everyone. Another option is to omit the object property and have each implementation choose its terms, like bill_id, amendment_id, or whatever others are supported.
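
One way an implementation could bridge the polymorphic `object` and explicit fields like `bill_id` is sketched below in Python. The `FOREIGN_KEYS` mapping and the `explicit_reference` helper are hypothetical; only the `Motion` type appears in the example above, and `Bill` is added purely for illustration.

```python
# Hypothetical mapping from "@type" to an explicit foreign-key field name.
FOREIGN_KEYS = {
    "Bill": "bill_id",
    "Motion": "motion_id",
}

def explicit_reference(motion):
    """Turn motion["object"] into an explicit {"<type>_id": id} pair."""
    obj = motion.get("object")
    if obj is None:
        return {}
    type_name = obj["@type"]
    if type_name not in FOREIGN_KEYS:
        raise ValueError(f"unsupported object type: {type_name!r}")
    return {FOREIGN_KEYS[type_name]: obj["id"]}

motion = {
    "object": {
        "@type": "Motion",
        "id": "of-another-motion-that-this-motion-is-amending",
    }
}
print(explicit_reference(motion))
```

The type tag is what makes this possible: with a bare `object_id` there would be nothing to dispatch on, which is the pain point raised earlier.
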

You can both hold a "vote", and cast your "vote".

Yes, the word "vote" is used in both ways - but Popolo can't have two classes with different semantics with the same name :) I understand that JSON can get away with it, though.

We currently have an Event type, so having a VoteEvent type sounds a bit confusing to me, and would make me think they should be merged (but really express totally different things)

In Popolo, VoteEvent will be a subclass of the eventual generic Event class. OCD's Event class is more specific; it's more like a Meeting than a generic Event. Would it be crazy to rename it to Meeting? Existing government monitoring systems model other events like elections, which have an impact on the end dates of memberships; legislative sessions, which scope bill and vote identifiers, etc. I think there's a benefit to having some common logic for all events, instead of building each as if they had nothing in common.

I think we use the term VoteEvent in the DB models

The JSON output is more important to Popolo than the DB, because JSON is where people will integrate with the system. The DB is rarely a concern, because it's not expected for third-parties to integrate at that level; interoperability at that level is much less likely to be achieved.

I do think the term VoteEvent could be really confusing to people

Does anything link to vote events? Since OCD doesn't have motions, and since individual votes are embedded in vote events, there's no place for the vote_event, vote_event_id or vote_events properties to be used. And OCD only declares class names in a custom _type field, so that's not an issue either. So, on the JSON layer, I think use of "Vote" instead of "VoteEvent" is conformant, but there's a risk it can lead to conformance issues later as OCD grows in scope.

My homework:

  1. Check the universality of session
  2. Consider the impacts of removing context and object

I think those are the two points preventing Popolo adoption. Correct?

jamesturk commented 10 years ago

OCDEP 7 is now up http://docs.opencivicdata.org/en/latest/proposals/0007.html

jamesturk commented 10 years ago

The main change from what we had is the decision that we can move

"motion": "motion goes here"

to

"motion": {"text": "motion goes here"}

without any real trouble; it'll be an API-level fix for now
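
A minimal Python sketch of that kind of API-level fix, assuming the vote is available as a plain dict. The `normalize_motion` helper is hypothetical and not part of any OCD library.

```python
def normalize_motion(vote):
    """Return a copy of `vote` whose "motion" is an object with a "text" key.

    Votes whose motion is already an object, or absent, are returned unchanged.
    """
    motion = vote.get("motion")
    if isinstance(motion, str):
        return {**vote, "motion": {"text": motion}}
    return vote

old_style = {"motion": "motion goes here"}
print(normalize_motion(old_style))
```

Because the function leaves already-converted votes alone, it can be applied at the API layer without touching stored data.
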

jpmckinney commented 10 years ago

Cool, I think classification should be under the motion object as well, since "passage" etc. classifies the motion, not the event.
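
For illustration, moving `classification` under the motion could look like the Python sketch below. The `nest_classification` helper is hypothetical, and it assumes the classification currently sits in a top-level `classification` key on the vote.

```python
def nest_classification(vote):
    """Return a copy of `vote` with a top-level classification moved under "motion"."""
    result = dict(vote)
    if "classification" not in result:
        return result
    motion = result.get("motion") or {}
    if isinstance(motion, str):
        # Tolerate the older string form of "motion".
        motion = {"text": motion}
    motion = dict(motion)
    motion["classification"] = result.pop("classification")
    result["motion"] = motion
    return result

vote = {"motion": {"text": "motion goes here"}, "classification": "passage"}
print(nest_classification(vote))
```
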

jpmckinney commented 10 years ago

Ref. #26