everypolitician / democratic-commons-tasks

An issues-only repository for tracking work for the Democratic Commons project

create a first version of the verification-pages tool #42

Open mhl opened 6 years ago

mhl commented 6 years ago

(A lot of this is off the top of my head, and definitely not prescriptive - I'm trying to sketch out how this might work, so comments and suggestions are very welcome, and I'll edit this in response...)

Background

A "verification page" is a wiki page on wikidata.org, generated by a script we will write, that looks like:

[Screenshot: verification-page-full-scratchpaad]

The section at the top, "New statements to verify", is the bit to focus on. Each of these rows makes an assertion about the position held by a politician. If that statement is verifiable (according to the quoted URL), then clicking "Yes" would cause that statement to be created (or updated) in Wikidata by some client-side Javascript. This should be an efficient way of going through a lot of crowd-sourced suggestions of who the politicians representing particular areas are, and adding the correct suggestions to Wikidata.

When either "Yes" or "No" is clicked, the client-side Javascript should record that decision by POSTing it to a service.
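As a rough sketch of what the receiving service might do with such a POST body: the field names (`statement`, `verified`, `user`) and the helper `parse_decision` are assumptions for illustration only, since nothing here defines the wire format yet.

```ruby
require "json"

# Hypothetical payload fields; the wire format is not specified in this issue.
REQUIRED_FIELDS = %w[statement verified user].freeze

# Parse and validate the JSON body of a decision POST.
# Raises ArgumentError if the payload is not usable.
def parse_decision(body)
  data = JSON.parse(body)
  raise ArgumentError, "expected a JSON object" unless data.is_a?(Hash)

  missing = REQUIRED_FIELDS.reject { |field| data.key?(field) }
  raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?

  unless [true, false].include?(data["verified"])
    raise ArgumentError, "verified must be true or false"
  end

  # "Yes" maps to verified: true, "No" to verified: false.
  { statement: data["statement"], verified: data["verified"], user: data["user"] }
end

decision = parse_decision('{"statement":"EXAMPLEUUID","verified":true,"user":"ExampleUser"}')
puts decision.inspect
```

Rejecting malformed payloads up front would at least keep junk out of the results table, though it does nothing to stop deliberate abuse.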

The next section is for statements that (for the moment, at least) can't be automatically applied to Wikidata if they're verified. For example, the suggestion might indicate that a politician has changed party when they're already recorded in Wikidata as representing a different party in the current term. The correct thing to do in that case would be to set an end time on the existing P39 statement and create a new one with an appropriate start time and the new party name.

The third section is for statements that have already either been verified, or couldn't be verified from the suggested URL.
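The three sections could be derived from each statement's state along these lines. This is only a sketch: the statement shape (a hash with `:actionable` and `:decision`) and the section names are made up for the example.

```ruby
# Hypothetical statement shape: a hash with :actionable (boolean) and
# :decision (nil until someone clicks, then :yes or :no).
def section_for(statement)
  # Anything with a recorded decision, yes or no, goes in the third section.
  return :done unless statement[:decision].nil?

  statement[:actionable] ? :to_verify : :manual
end

# Group statements into the three page sections, preserving input order.
def group_by_section(statements)
  groups = { to_verify: [], manual: [], done: [] }
  statements.each { |statement| groups[section_for(statement)] << statement }
  groups
end

example = [
  { id: "a", actionable: true,  decision: nil },
  { id: "b", actionable: false, decision: nil },
  { id: "c", actionable: true,  decision: :no }
]
groups = group_by_section(example)
puts groups.transform_values { |list| list.map { |s| s[:id] } }.inspect
```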

We have already got most of the client-side Javascript and Wikimedia templates to make such a page work once it exists. However, we still don't have the following key components:

I'd suggest that this is a Rails app with a rake task to regenerate the page, but it could perfectly well be done in any number of other ways. (Note that we'll probably want to allow someone to manually trigger regeneration of any verification page via clicking a button on the verification page even if someone doesn't have the Javascript extension installed, so it'll need to be triggerable by a GET request to the web app too - this is roughly how the "Refresh now" button works on https://www.wikidata.org/wiki/User:Mhl20/Prime_minister_test for example.)

This web app and database should be hosted on Wikimedia's tool server, as we've done with the prompter and position holder history tool, although it could be developed locally.

We're assuming that there's a verification page per legislature in the first place. (And probably a page per executive for executive positions, but we'll get this working for legislatures first.)

Prompt generation rake task

This needs to fetch from the database a set of statements for verification associated with the legislature, regardless of whether they've had a decision made about them, since we want to show recent decisions as well as those that need a decision.

Each actionable statement needs the following associated with it:

The Wikitext that should be created for actionable statements will look like:

{{/table_row_actionable|statement=EXAMPLEUUID|subject=Q7148353|baserevid=19284592457|property=P39|object=Q45308607|qualifier_P768=Q12626632|qualifier_P4100=|qualifier_P2937=EXAMPLETERM|reference=http://www.assembly.nu.ca/members/mla}}

(where "subject" is the person's Wikidata item ID, from "subject verb object" terminology for a triple)
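A minimal sketch of how the rake task might render one such row, assuming the statement is available as a hash keyed by the template parameter names. The helper name `actionable_row` and the hash shape are illustrative, not an agreed design.

```ruby
# Parameter order follows the example row above.
ACTIONABLE_PARAMS = %i[
  statement subject baserevid property object
  qualifier_P768 qualifier_P4100 qualifier_P2937 reference
].freeze

# Render one actionable statement as a template call.
# Missing values are emitted as empty parameters (e.g. "qualifier_P4100=").
def actionable_row(values)
  params = ACTIONABLE_PARAMS.map { |name| "#{name}=#{values[name]}" }
  "{{/table_row_actionable|#{params.join('|')}}}"
end

puts actionable_row(
  statement: "EXAMPLEUUID",
  subject: "Q7148353",
  baserevid: "19284592457",
  property: "P39",
  object: "Q45308607",
  qualifier_P768: "Q12626632",
  qualifier_P2937: "EXAMPLETERM",
  reference: "http://www.assembly.nu.ca/members/mla"
)
```

One thing this sketch ignores is escaping: a value containing a `|` character would be split into separate template parameters, so real output would need to handle that.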

There will be some statements where we cannot easily apply the statement: this will be due to there being an existing statement which we can't update safely. These should be output as wikitext like:

{{/table_row_manual|statement=EXAMPLEUUID|subject=Q7148353|property=P39|object=Q45308607|qualifier_P768=Q12626632|qualifier_P4100=|qualifier_P2937=EXAMPLETERM|reference=http://www.assembly.nu.ca/members/mla|why_not_actionable=This suggests the person switched parties midterm; you'll have to verify and fix this manually if it's correct}}

Database model

I'm hesitant about prescribing a particular database model, because I'm sure there'd be silly mistakes, and also I'm not sure how general we should attempt to make it: potentially this could be used for adding Wikidata statements for any property based on information that needs verifying, not just P39s for political positions, and the qualifiers could be anything. But so long as we're using a framework with support for database migrations, I think that's something we could move towards once this first use case of political positions is working nicely.

However, my starting suggestion is that we'll probably want to have these tables:

A table where each row represents a verification page, with columns for the following:

A table with statements for verification, which includes columns for:

A table with verification results:

Web app

This needs to have endpoints that:

Script to populate statements for verification from the suggestions-store

The statements for verification will be sourced from the suggestions-store component that @chrismytton has been working on; this should have an endpoint that will return suggestions, at least including:

We have an id-mapping-store component which should be used for looking up the Wikidata IDs for the person, party and area ID. The area ID mapping should always be present, but there may be no Wikidata ID for the person or party if they haven't been found yet via Mix 'n' Match games. My suggestion for what this script should do in cases where the Wikidata ID can't be found is that it shouldn't create rows in the database for such statements, but instead update a count (perhaps stored as a column on the "page" table) of items that couldn't be processed because of missing Wikidata IDs. If those counts are non-zero, we could include on the verification page a link to the appropriate Mix 'n' Match game.
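The skip-and-count behaviour might look something like the following. Everything here is a stand-in: the suggestion fields (`:person_id`, `:party_id`, `:area_id`), the plain hash used in place of a real id-mapping-store lookup, and the function name `resolve_suggestions` are all assumptions for the sake of the example.

```ruby
# Hypothetical shapes: each suggestion carries person/party/area IDs from the
# suggestions-store, and id_map stands in for an id-mapping-store lookup.
def resolve_suggestions(suggestions, id_map)
  statements = []
  missing_count = 0

  suggestions.each do |suggestion|
    person = id_map[suggestion[:person_id]]
    party = id_map[suggestion[:party_id]]
    # The area mapping should always exist, so a miss here is treated as a
    # bug and raises KeyError rather than being silently counted.
    area = id_map.fetch(suggestion[:area_id])

    if person.nil? || party.nil?
      # No database row for this one; just count it for the page.
      missing_count += 1
      next
    end

    statements << { subject: person, party: party, area: area }
  end

  { statements: statements, missing_count: missing_count }
end

id_map = { "person-1" => "Q7148353", "party-1" => "Q100", "area-1" => "Q12626632" }
suggestions = [
  { person_id: "person-1", party_id: "party-1", area_id: "area-1" },
  { person_id: "person-2", party_id: "party-1", area_id: "area-1" }
]
result = resolve_suggestions(suggestions, id_map)
puts "#{result[:statements].length} statement(s), #{result[:missing_count]} missing"
```

The party and area item IDs in the usage example are placeholders, not real mappings.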

Questions

Is there any way we can secure the endpoint that accepts the POST requests with the results of decisions, in order to stop someone from just recording "no" for all statements, or random decisions? (I can see how to make that more awkward, but not how to stop it completely.) This kind of abuse of the system isn't such a problem in terms of the edits made to Wikidata, since that can be handled by Wikidata moderation and reversion tools, but we need to consider potential vandalism of this API separately.

Future enhancements

It would be great if this component also generated a dashboard or summary page, linking to the verification pages that need most work, indicating progress and with a leaderboard for the users who have contributed most.

As discussed above, we'd want to keep in mind that we may want to make this tool more generic, in the sense of supporting statements other than P39 and a configurable set of qualifiers.

mhl commented 6 years ago

One thing which that description didn't go into in any detail is when we can update an existing P39, when there's an existing one that needs to be updated manually and when to create a completely new one.

I suggest, following on from a conversation with @tmtmtmtm in Slack, that a pseudocode-ish expression of what we want to do might look like:

If all of the existing P39 statements for the person are classified as "ignore" by the above => actionable (create)