Best way to model a Candidacy

tmtmtmtm commented 9 years ago

There's been some discussion around how to best model election candidates for YourNextRepresentative as it expands to cope with a variety of different election systems. I'm opening this issue to give us a place to consolidate that discussion, and hopefully arrive at a consensus.

One stumbling point in the discussion to date seems to have been how best to track what Party someone is representing. Much of this seems to have paralleled the discussions around how to track the Party of an elected legislator, so I believe the solution will probably be very similar.

I can imagine two scenarios that would make a simple "look up what Organisation (with a classification of Party) this Person has an active Membership in" query problematic (on top of the normal issues around this sort of indirection):

someone who is currently an MP for one party, but is standing for a different party in the upcoming election
someone who is standing simultaneously for two different parties

I suspect neither of these is particularly common, but I've seen both happen.

I would propose, therefore, that the best approach is to parallel the current consensus around what the membership would look like post-election, and mirror it as much as possible into the pre-election model. In particular this would involve a Membership of either the Post (in a constituency-based model using those), or the Legislature as a whole (possibly with an Area attached), using the on_behalf_of property, the legislative_period property, and setting a role of "candidate" rather than "member".

So, for example, Fred Bloggs standing for the People's Party in the 2015 election would be:

Membership:

{
    person_id: "fred_bloggs",
    post_id: "constituency/southern_heights",  // or organization_id
    role: "candidate",
    on_behalf_of_id: "peoples_party",
    legislative_period: "term/2015"
}

(with an optional start_date for when the candidacy was declared, and end_date if they drop out before the election)

And then, if he were elected, an almost identical separate membership:

{
    person_id: "fred_bloggs",
    post_id: "constituency/southern_heights",  // or organization_id
    role: "member",
    on_behalf_of_id: "peoples_party",  // {{1}}
    legislative_period: "term/2015"
}
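As a rough illustration of how closely the two records mirror each other, here is a minimal Python sketch using plain dicts. The helper name `elected_membership` is hypothetical and not part of Popolo; the field names are simply the ones used in the examples above.

```python
# Candidate membership, using the field names from the example above.
candidacy = {
    "person_id": "fred_bloggs",
    "post_id": "constituency/southern_heights",
    "role": "candidate",
    "on_behalf_of_id": "peoples_party",
    "legislative_period": "term/2015",
}


def elected_membership(candidate, on_behalf_of_id=None):
    """Return a separate "member" record derived from a candidate record.

    Hypothetical helper: on_behalf_of_id can be overridden when the
    parliamentary faction differs from the party stood for (see {{1}}).
    """
    member = dict(candidate)  # copy, so the candidacy record is left untouched
    member["role"] = "member"
    if on_behalf_of_id is not None:
        member["on_behalf_of_id"] = on_behalf_of_id
    return member


membership = elected_membership(candidacy)
```

Both records survive side by side, differing only in `role` unless a different `on_behalf_of_id` is passed in.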

Thoughts? Problems? Suggestions?

{{1}} as a slight aside, this would also enable quite a clean approach for cases where there's a strong distinction between the Party a person stands for election on behalf of, and the Parliamentary Party/Faction that their actual legislative membership would be on behalf of — and might thus also provide a nice way to model that scenario even in the absence of tracking the losing candidates.

martinszy commented 9 years ago

I'm just going to mention @lfalvarez here so he can provide his comments.

jpmckinney commented 9 years ago

Note: legislative_period not legislative_session.

For YNMP, @mhl and I decided to drop organization_id from candidate memberships, since candidates are not members of the legislature; the idea was that a naive query like:

SELECT * FROM memberships WHERE organization_id = 'house-of-commons'

wouldn't accidentally pick up the candidates. There would still be a tie to the organization through the post_id. However, in non-constituency-based systems, that's not an option. So, 2 options:

1. All systems must know to treat memberships with a role of "candidate" differently from all other memberships.
2. Subclass the Membership class to create a Candidacy class, which will have different semantics in terms of organizational membership.

I'm leaning towards Candidacy, since candidates are not members, and it's kind of crazy for one field's value (role="candidate") to change the semantics for the entire object.
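To make the concern concrete, here is a small sketch using Python's built-in `sqlite3` module. The three-column `memberships` table and the person IDs are assumptions made for illustration only; they are not a schema that Popolo or YNMP defines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memberships (person_id TEXT, organization_id TEXT, role TEXT)"
)
conn.executemany(
    "INSERT INTO memberships VALUES (?, ?, ?)",
    [
        ("sitting_member", "house-of-commons", "member"),
        # A candidate that keeps organization_id, as a non-constituency
        # system would need it to.
        ("fred_bloggs", "house-of-commons", "candidate"),
    ],
)

# Naive query: returns the candidate alongside the sitting member.
naive = conn.execute(
    "SELECT person_id FROM memberships "
    "WHERE organization_id = ? ORDER BY person_id",
    ("house-of-commons",),
).fetchall()

# First option: every consumer has to remember the extra role filter.
members_only = conn.execute(
    "SELECT person_id FROM memberships "
    "WHERE organization_id = ? AND role != 'candidate'",
    ("house-of-commons",),
).fetchall()
```

Here `naive` contains both people, while `members_only` contains only the sitting member. With a separate Candidacy collection, the first query would already be correct without the extra filter.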

tmtmtmtm commented 9 years ago

Note: legislative_period not legislative_session.

Oops, yes. I've fixed the original so that it's not misleading. Feel free to revert if you prefer not to rewrite history :)

tmtmtmtm commented 9 years ago

Subclass the Membership class to create a Candidacy class, which will have different semantics in terms of organizational membership

What would those differences be? In the case of a non-constituency-based system, you're still going to have to say somehow which legislative Organization they're a candidate for, and ending up with a class that is entirely identical, other than in name, would be a little crazy too! Or are there other things that would be added/removed/changed for a Candidacy?

jpmckinney commented 9 years ago

It doesn't seem crazy to me to look up candidates in one database table and members in another database table. Candidates and members are quite different in people's minds, even if they have nearly identical properties. The use cases for each, to my knowledge, are non-overlapping. Is there a question whose answer contains both candidates and members? If not, then a logical separation seems reasonable.

The main difference is that candidates are not members of the organization to which they are attached, and avoiding confusion on that difference alone makes an additional class acceptable in my mind. Candidates can also have an "incumbent" aura. It also probably makes more sense to reference a specific election than to reference a legislative period which these people will never participate in as candidates. Candidates can also have additional properties for use cases around electoral finance, though these haven't been developed in Popolo yet.
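A minimal sketch of what that separation could look like, using Python dataclasses. The `Candidacy` fields `election_id` and `incumbent` are assumptions drawn from the points above, not properties Popolo defines, and the sketch uses a standalone class rather than a literal subclass so that the two collections stay disjoint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Membership:
    person_id: str
    organization_id: str
    role: str = "member"
    on_behalf_of_id: Optional[str] = None
    legislative_period: Optional[str] = None


@dataclass
class Candidacy:
    # Not a Membership subclass in this sketch, so code that iterates
    # over memberships never sees candidates.
    person_id: str
    election_id: str                       # assumed: reference to a specific election
    organization_id: Optional[str] = None  # body being contested, if no post
    post_id: Optional[str] = None
    on_behalf_of_id: Optional[str] = None
    incumbent: bool = False                # assumed flag for the "incumbent" aura


memberships = [Membership("sitting_member", "house-of-commons")]
candidacies = [
    Candidacy(
        "fred_bloggs",
        "election/2015",  # hypothetical identifier
        post_id="constituency/southern_heights",
        on_behalf_of_id="peoples_party",
    ),
]

# Anything that asks "who are the members?" only looks at memberships.
member_ids = [m.person_id for m in memberships]
```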

tmtmtmtm commented 9 years ago

It doesn't seem crazy to me to look up candidates in one database table and members in another database table

Sure, though I think I've missed the discussion where Popolo has become a tabular format… :)

Is there a question whose answer contains both candidates and members?

You'll often want to know which of the candidates won. It seems that that would be an easier query generally if that was on the Candidacy rather than having to run a separate query against the Memberships.

The main difference is that candidates are not members of the organization to which they are attached, and avoiding confusion on that difference alone makes an additional class acceptable in my mind

Yep — that's the main thing that's persuasive to me. On its own, however, that would imply there's a different abstraction missing, rather than that there should be two identical classes. That's why I'm trying to work out what other differences there would be.

Another would be that people commonly want to record how many votes each candidate received.

jpmckinney commented 9 years ago

Sure, though I think I've missed the discussion where Popolo has become a tabular format… :)

Popolo doesn't operate in a vacuum - it realizes that implementations do eventually occur, so it does try to avoid models that require overly complex database schemas :)

You'll often want to know which of the candidates won. It seems that that would be an easier query generally if that was on the Candidacy rather than having to run a separate query against the Memberships.

OpenElections uses a winner boolean on the candidate, so the Candidacy and Memberships models are still disjoint for that use case.

Another would be that people commonly want to record how many votes each candidate received.

Indeed. It makes more sense to tie electoral information to Candidacy objects that are in a distinct collection from the Membership objects. It seems odd to mix Candidacies and Memberships in one big bag.

tmtmtmtm commented 9 years ago

OpenElections uses a winner boolean on the candidate

Purely as a data-point to consider in all this (rather than an argument for any particular approach), it's worth noting that there are many countries where it's parties who win a certain number of seats, rather than candidates winning themselves, and also where candidates who get the most votes might not actually become the representatives.

In Estonia, for example, it's very common for well-known politicians to stand for election, and receive substantially more votes than other people in their party (or indeed the most votes outright), but for them to not actually take any of the seats their party is then allocated. A simple boolean isn't really going to suffice in those sorts of cases.
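One way to picture the problem is to record the vote count and the outcome as separate facts, sketched here in Python. The class name `CandidacyResult`, its fields, and the figures are all made up for illustration; nothing here is an agreed model or real election data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidacyResult:
    person_id: str
    on_behalf_of_id: str
    votes: Optional[int] = None  # votes received, if known
    took_seat: bool = False      # whether the person actually became a representative


# Illustrative, made-up figures.
results = [
    CandidacyResult("well_known_politician", "party_a", votes=9000, took_seat=False),
    CandidacyResult("list_colleague", "party_a", votes=800, took_seat=True),
]

# The person with the most votes is not necessarily among those seated.
top_by_votes = max(results, key=lambda r: r.votes or 0)
seated = [r.person_id for r in results if r.took_seat]
```

Even this split is only a starting point: it says nothing about how seats were allocated to the party in the first place.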

jpmckinney commented 9 years ago

Yes, there is certainly more work to do if we wanted to model election results. However, I'm just showing that the election use cases can/should be solved in "elections land" and don't need "legislature land" for answers.

Floppy commented 9 years ago

I definitely agree it would be great to have some guidance on how to do this. I'm trying to represent our party's candidacies (past and future) in various elections in https://github.com/SomethingNewUK/candidates/, and not knowing how to properly represent an election (and its results) is a bit awkward :)

So far, it's modeled as a set of memberships of an organization, with role "candidate", but it doesn't seem quite right, and that leaves no way to represent the abstract election itself, rather than the organisation. For instance, on what date did the election take place?

I actually want to make this data the canonical source of information, and use it to drive our website, so it would be great to be able to represent everything directly in popolo.

tmtmtmtm commented 9 years ago

Hi @Floppy, there are definitely still unanswered questions about modelling the Memberships here. One thing that hasn't been mentioned in this discussion so far, and that should help with at least part of your issue: to model the Election itself (with dates etc.), you should be using an Event.

Floppy commented 9 years ago

Ah, nice one, I had wondered about that. I'll do some more changes on that basis.

Floppy commented 9 years ago

So with that advice, I've updated https://github.com/SomethingNewUK/election-data (recently renamed) to use Events to model the elections themselves, including sub-events for wards or constituency votes within a larger election. The people and organization models work, but using membership for candidacy still leaves a lot to be desired.

Reading through the above, and having tried to model this in a way that seems self-consistent, I think a Candidacy model would be most useful, which can link to organization IDs for party and the thing they're being elected to. Ideally, if elections are modelled as events, there should be a way to link the event ID to the candidacy as well. It actually needs to be a join between the person, the event, the organisation they're being elected to and the party. Personally I think some of the data involved is better in the event (organisation being elected and area), but that would mean having a special Election model as well of course.

As mentioned, a Candidacy model would also allow results to be included in some way, even if initially just a freeform text field so as not to assume a particular system.

I'm entirely happy to do more work on this, as like I say I want to model this stuff properly to generate our website. I'm also entirely happy to have the link above be used as an example for others trying to represent the same thing, or to help write some docs on it when it's settled.

jpmckinney commented 9 years ago

Thanks for the contributions @Floppy. My current priority is to rework the documentation to be more accessible, descriptive, and complete. Nonetheless, it will be helpful to document in this issue the possible changes or additions to be made to Popolo in order to fulfill the elections use case.

Floppy commented 9 years ago

I'm trying to use the data files I've made in anger at the moment to build our candidates page automatically, so I should work out the missing parts in that process. I'll post back here with comments (or link to blog post) when I've done that.

Floppy commented 8 years ago

A year on, I've had some more thoughts on this, and my current ideas are in https://github.com/SomethingNewUK/election-data/tree/gh-pages/candidacies and https://github.com/SomethingNewUK/election-data/tree/gh-pages/elections.

Elections are an event model, just with a classification of election, and the addition of an organization field for the body being elected to.

The candidacies are a completely new home-rolled model, which pretty much just joins a Person, an Event, and an Organization (the party the person is standing for). There are optional links for things like campaign websites. This is very much like a Membership relation.
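A rough Python sketch of that join, using plain dicts. The key names (`event_id`, `organization_id`, `links`), the identifiers, and the helper `candidates_for` are assumptions for illustration and may not match the files in the linked repository.

```python
# Elections: Events with a classification of "election", plus the body
# being elected to. Identifiers here are hypothetical.
events = {
    "election/2015": {
        "classification": "election",
        "organization_id": "house-of-commons",  # body being elected to
    },
}

# Candidacies: a join of Person, Event, and Organization (the party).
candidacies = [
    {
        "person_id": "fred_bloggs",
        "event_id": "election/2015",
        "organization_id": "peoples_party",  # party stood for
        "links": [],  # optional, e.g. campaign websites
    },
]


def candidates_for(body_id):
    """Return person IDs standing in any election to the given body."""
    return [
        c["person_id"]
        for c in candidacies
        if events[c["event_id"]]["classification"] == "election"
        and events[c["event_id"]]["organization_id"] == body_id
    ]


standing = candidates_for("house-of-commons")
```

The body being contested lives on the event rather than on the candidacy, so the candidacy itself never claims membership of that body.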

Take a look and see what you think.

jpmckinney commented 7 years ago

@fgregg @palewire @jungshadow Re: https://github.com/datamade/docs.opencivicdata.org/blob/elections/proposals/drafts/elections.rst, in Popolo (and thus OpenCivicData), Elections are just Events (or can be made a sub-class of Event if they have unique properties). This thread has a discussion about Candidacy, which I need to roll up into a summary - at which point it'll be easier to compare to your proposal and VIP. (In general I'd just recommend using VIP if your use case is limited to elections, but as your use case grows outside that, it may need to connect to what Popolo provides.)

popolo-project / popolo-spec

Best way to model a Candidacy #104