(A lot of this is off the top of my head, and definitely not prescriptive - I'm trying to sketch out how this might work, so comments and suggestions are very welcome, and I'll edit this in response...)

Background

A "verification page" is a wiki page on wikidata.org, generated by a script we will write, that looks like:

[Screenshot: verification-page-full-scratchpaad]

The section at the top, "New statements to verify", is the bit to focus on. Each of these rows makes an assertion about the position held by a politician - if that statement is verifiable (according to the quoted URL), then clicking "Yes" would cause that statement to be created (or updated) in Wikidata by some client-side JavaScript. This should be an efficient way of going through a lot of crowd-sourced suggestions of who the politicians representing particular areas are, and adding the correct suggestions to Wikidata.

When either "Yes" or "No" is clicked, the client-side JavaScript should record that decision by POSTing it to a service.

The next section is for statements that (for the moment, at least) can't be automatically applied to Wikidata even if they're verified. For example, the suggestion might indicate that a politician has changed party when they're already recorded in Wikidata as representing a different party in the current term. The correct thing to do in that case would be to set an end time on the existing P39 statement, and create a new one with an appropriate start time and the new party name.

The third section is for statements that have already either been verified, or couldn't be verified from the suggested URL.

We have already got most of the client-side JavaScript and Wikimedia templates to make such a page work once it exists. However, we still don't have the following key components:

The database that stores these statements for verification, and their verification status.
The script that generates such verification pages.
The script that adds the statements to the database, based on those from suggestions-store.
The web app that receives the POST request, updates the verification status and triggers a regeneration of the verification page.

I'd suggest that this is a Rails app with a rake task to regenerate the page, but it could perfectly well be done in any number of other ways. (Note that we'll probably want to let someone manually trigger regeneration of any verification page by clicking a button on that page, even if they don't have the JavaScript extension installed, so regeneration will need to be triggerable by a GET request to the web app too - this is roughly how the "Refresh now" button works on https://www.wikidata.org/wiki/User:Mhl20/Prime_minister_test for example.)

This web app and database should be hosted on Wikimedia's tool server, as we've done with the prompter and position holder history tool, although it could be developed locally.

We're assuming that there's a verification page per legislature in the first place. (And probably a page per executive for executive positions, but we'll get this working for legislatures first.)

Prompt generation rake task

This needs to fetch from the database the set of statements for verification associated with the legislature, regardless of whether a decision has been made about them, since we want to show recent decisions as well as statements that still need a decision.

Each actionable statement needs the following associated with it:

An ID for the statement for verification (n.b. initially most of these suggestions will be coming from crowdsourcing, with a transaction ID, but I think this system should have its own IDs, since we may well want to inject suggestions for verification from other sources in the future.)
A reference URL from which the statement could potentially be verified - initially this will be the same for each statement for verification in a legislature
The Wikidata item ID for the person
The current revision ID of the person from Wikidata (which you can get via the schema:version property in a SPARQL query)
The statement UUID for any existing P39 statement, if there's an existing statement that should be updated rather than creating a new statement
The Wikidata item ID for the position they hold (e.g. Member of the UK parliament) - this is the object of the P39 statement
The Wikidata item for the parliamentary group their position is associated with (P4100) - this may be omitted, since we might not get it as part of the crowd-sourced data
The Wikidata item for the electoral district their position is associated with (P768)
The Wikidata item for the parliamentary term this position is associated with (P2937)
The verification status - represented by an enum whose values are initially UNDECIDED, YES and NO
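
To make the list above a bit more concrete, here's a rough Python sketch of one statement for verification (Python purely for illustration, since the real thing might well be a Rails model). The names `VerificationStatus`, `StatementForVerification` and `revision_query` are made up for this example, and the field names are assumptions. The query helper shows one way the revision ID could be fetched via `schema:version` in SPARQL:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VerificationStatus(Enum):
    UNDECIDED = "undecided"
    YES = "yes"
    NO = "no"


@dataclass
class StatementForVerification:
    id: int                          # this system's own ID, not the transaction ID
    reference_url: str
    person_item: str                 # e.g. "Q7148353"
    person_revision: Optional[int]   # current revision of the person's item
    position_item: str               # object of the P39 statement
    term_item: str                   # P2937 qualifier
    district_item: str               # P768 qualifier
    group_item: Optional[str] = None               # P4100 qualifier, may be missing
    existing_statement_uuid: Optional[str] = None  # set only when updating
    status: VerificationStatus = VerificationStatus.UNDECIDED


def revision_query(person_items):
    """Build a SPARQL query returning the current revision of each item."""
    values = " ".join(f"wd:{item}" for item in person_items)
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX schema: <http://schema.org/>\n"
        "SELECT ?item ?revision WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item schema:version ?revision .\n"
        "}"
    )


print(revision_query(["Q7148353"]))
```

Batching all the people for a legislature into one `VALUES` clause would keep this to a single query per page regeneration, but that's only a suggestion.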

The Wikitext that should be created for actionable statements will look like:

{{/table_row_actionable|statement=EXAMPLEUUID|subject=Q7148353|baserevid=19284592457|property=P39|object=Q45308607|qualifier_P768=Q12626632|qualifier_P4100=|qualifier_P2937=EXAMPLETERM|reference=http://www.assembly.nu.ca/members/mla}}

(where "subject" is the person's Wikidata item ID, from "subject verb object" terminology for a triple)
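
As a sketch of how the generation script might produce the actionable row shown above, assuming each statement arrives as a plain dict (the function name and dict keys are hypothetical, and which identifier actually feeds the `statement` parameter is an assumption too):

```python
def actionable_row(s):
    """Render one {{/table_row_actionable}} call from a dict of statement fields."""
    params = [
        ("statement", s.get("statement") or ""),
        ("subject", s["person_item"]),
        ("baserevid", str(s["person_revision"])),
        ("property", "P39"),
        ("object", s["position_item"]),
        ("qualifier_P768", s.get("district_item") or ""),
        ("qualifier_P4100", s.get("group_item") or ""),  # empty when unknown
        ("qualifier_P2937", s["term_item"]),
        ("reference", s["reference_url"]),
    ]
    # Note: a "|" inside any value would break the template call; this
    # sketch does no escaping.
    body = "|".join(f"{name}={value}" for name, value in params)
    return "{{/table_row_actionable|" + body + "}}"


print(actionable_row({
    "statement": "EXAMPLEUUID",
    "person_item": "Q7148353",
    "person_revision": 19284592457,
    "position_item": "Q45308607",
    "district_item": "Q12626632",
    "group_item": None,
    "term_item": "EXAMPLETERM",
    "reference_url": "http://www.assembly.nu.ca/members/mla",
}))
```

With that input the function reproduces the example row above, including the empty `qualifier_P4100=` for a missing parliamentary group.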

There will be some statements where we cannot easily apply the statement: this will be due to there being an existing statement which we can't update safely. These should be output as wikitext like:

{{/table_row_manual|statement=EXAMPLEUUID|subject=Q7148353|property=P39|object=Q45308607|qualifier_P768=Q12626632|qualifier_P4100=|qualifier_P2937=EXAMPLETERM|reference=http://www.assembly.nu.ca/members/mla|why_not_actionable=This suggests the person switched parties midterm; you'll have to verify and fix this manually if it's correct}}
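
Here's a rough sketch of how the script might decide that a statement needs the manual row, using the party-switch case as the only rule. Everything here is an assumption for illustration: the function names, the dict keys, and the idea that `existing` is the current P39 statement for the same position and term (or `None` if there isn't one).

```python
PARTY_SWITCH = (
    "This suggests the person switched parties midterm; "
    "you'll have to verify and fix this manually if it's correct"
)


def why_not_actionable(suggestion, existing):
    """Return a reason string if the suggestion can't be applied safely, else None."""
    if existing is None:
        return None  # nothing to conflict with: a new statement can be created
    old, new = existing.get("group_item"), suggestion.get("group_item")
    if old and new and old != new:
        return PARTY_SWITCH
    return None  # same or unknown group: the existing statement can be updated


def manual_row(s, reason):
    """Render one {{/table_row_manual}} call (no baserevid, plus the reason)."""
    params = [
        ("statement", s.get("statement") or ""),
        ("subject", s["person_item"]),
        ("property", "P39"),
        ("object", s["position_item"]),
        ("qualifier_P768", s.get("district_item") or ""),
        ("qualifier_P4100", s.get("group_item") or ""),
        ("qualifier_P2937", s["term_item"]),
        ("reference", s["reference_url"]),
        ("why_not_actionable", reason),
    ]
    return "{{/table_row_manual|" + "|".join(f"{k}={v}" for k, v in params) + "}}"


suggestion = {
    "statement": "EXAMPLEUUID",
    "person_item": "Q7148353",
    "position_item": "Q45308607",
    "district_item": "Q12626632",
    "group_item": "EXAMPLEGROUP_B",
    "term_item": "EXAMPLETERM",
    "reference_url": "http://www.assembly.nu.ca/members/mla",
}
reason = why_not_actionable(suggestion, {"group_item": "EXAMPLEGROUP_A"})
if reason:
    print(manual_row(suggestion, reason))
```

Other "can't update safely" cases would presumably become further checks in `why_not_actionable`, each returning its own explanation.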

Database model

I'm hesitant about prescribing a particular database model, because I'm sure there'd be silly mistakes, and also I'm not sure how general we should attempt to make it: potentially this could be used for adding Wikidata statements for any property based on information that needs verifying, not just P39s for political positions, and the qualifiers could be anything. But so long as we're using a framework with support for database migrations, I think that's something we could move towards once this first use case of political positions is working nicely.

However, my starting suggestion is that we'll probably want to have these tables:

A table where each row represents a verification page, with columns for the following:

A numeric primary key
The page title on the Wiki
A membership position item
A term item
A default reference URL
A boolean representing if political parties are required for this position

A table with statements for verification, which includes columns for:

A numeric primary key (the ID for the statement for verification)
A (possibly null) transaction ID representing the crowd-sourced suggestion this came from
The Wikidata item ID for the person
The current revision ID of the person from Wikidata
The statement UUID for any existing P39 statement
The Wikidata item for the parliamentary group
The Wikidata item for the electoral district their position is associated with (P768)
The Wikidata item for the parliamentary term this position is associated with (P2937)

A table with verification results:

The verification status - represented by an enum whose values are initially UNDECIDED, YES and NO
The wiki user who made the decision
A foreign key to the statement
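
To show roughly what those three tables might look like, here's a sketch using SQLite from Python's standard library (just so it's runnable; in a Rails app this would be a set of migrations). All the table and column names are assumptions, and two things are my additions rather than part of the lists above: a `page_id` foreign key on statements, and the idea that the most recent verification row is the current status.

```python
import sqlite3

SCHEMA = """
CREATE TABLE pages (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    position_item TEXT NOT NULL,
    term_item TEXT NOT NULL,
    default_reference_url TEXT,
    parties_required INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE statements (
    id INTEGER PRIMARY KEY,
    page_id INTEGER NOT NULL REFERENCES pages(id),  -- assumption, see above
    transaction_id TEXT,            -- null if not from crowd-sourcing
    person_item TEXT NOT NULL,
    person_revision INTEGER,
    existing_statement_uuid TEXT,
    group_item TEXT,
    district_item TEXT,
    term_item TEXT
);
CREATE TABLE verifications (
    id INTEGER PRIMARY KEY,
    statement_id INTEGER NOT NULL REFERENCES statements(id),
    status TEXT NOT NULL CHECK (status IN ('UNDECIDED', 'YES', 'NO')),
    wiki_user TEXT
);
"""


def create_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def current_status(conn, statement_id):
    """Latest recorded decision for a statement, or UNDECIDED if there is none."""
    row = conn.execute(
        "SELECT status FROM verifications WHERE statement_id = ? "
        "ORDER BY id DESC LIMIT 1",
        (statement_id,),
    ).fetchone()
    return row[0] if row else "UNDECIDED"
```

Keeping results in their own table means a changed decision is a new row rather than an overwrite, so the page can show recent decisions and who made them.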

Web app

This needs to have endpoints that:

Receive verification results (a statement ID and the new verification status) and record them
Trigger a run of the script to update a particular verification page
Provide an interface to let us add new pages (the data in the "page" table, essentially)
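
As a framework-agnostic sketch of the first two endpoints (the paths `/results` and `/pages/<id>/regenerate`, the parameter names, and the dict standing in for the database are all made up for illustration; the page-adding interface is left out):

```python
def handle(method, path, params, db):
    """Dispatch one request and return (HTTP status code, response text).

    `db` is a plain dict standing in for the real database:
      "pages": set of page IDs, "statements": {statement_id: page_id},
      "results": list of recorded decisions, "regenerate_queue": list of page IDs.
    """
    parts = [p for p in path.split("/") if p]

    # POST /results: record a decision, then queue that page for regeneration.
    if method == "POST" and parts == ["results"]:
        try:
            statement_id = int(params["statement_id"])
        except (KeyError, TypeError, ValueError):
            return 400, "statement_id must be an integer"
        status = params.get("status")
        if status not in ("YES", "NO"):
            return 400, "status must be YES or NO"
        if statement_id not in db["statements"]:
            return 404, "unknown statement"
        db["results"].append(
            {"statement_id": statement_id, "status": status, "user": params.get("user")}
        )
        db["regenerate_queue"].append(db["statements"][statement_id])
        return 200, "recorded"

    # GET or POST /pages/<id>/regenerate: manual trigger, so it also works
    # from a plain link or button without any client-side script.
    if (method in ("GET", "POST") and len(parts) == 3
            and parts[0] == "pages" and parts[2] == "regenerate"):
        if not parts[1].isdigit() or int(parts[1]) not in db["pages"]:
            return 404, "unknown page"
        db["regenerate_queue"].append(int(parts[1]))
        return 202, "regeneration queued"

    return 404, "not found"


db = {"pages": {1}, "statements": {7: 1}, "results": [], "regenerate_queue": []}
print(handle("POST", "/results", {"statement_id": "7", "status": "YES"}, db))
print(handle("GET", "/pages/1/regenerate", {}, db))
```

Queuing the regeneration rather than running it inline is a design guess: it would keep the POST fast and let several quick decisions on the same page collapse into one rebuild.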

Script to populate statements for verification from the suggestions-store

The statements for verification will be sourced from the suggestions-store component that @chrismytton has been working on; this should have an endpoint that will return suggestions, at least including:

the country the suggestion is relevant to
some identifier that enables us to infer the position item ID (@mhl to resolve)
person (with an internal ID from our partner)
party (with an internal ID from our partner)
area ID (with our ID)

We have an id-mapping-store component which should be used for looking up the Wikidata IDs for the person, party and area ID. The area ID mapping should always be present, but there may be no Wikidata ID for the person or party if they haven't been found yet via Mix 'n' Match games. My suggestion is that where a Wikidata ID can't be found, this script shouldn't create rows in the database for such statements, but should instead update a count (perhaps stored as a column on the "page" table) of items that couldn't be processed because of missing Wikidata IDs - if those counts are non-zero, we could include on the verification page a link to the appropriate Mix 'n' Match game.
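
A minimal sketch of that skip-and-count behaviour, assuming suggestions arrive as dicts and using a plain dict in place of real id-mapping-store lookups (the function name, keys and the `(kind, id)` lookup shape are all hypothetical). It also assumes a suggestion with no party at all is fine, while a party that is present but unmapped counts as missing:

```python
def build_statements(suggestions, id_map):
    """Return (rows to insert, count of suggestions skipped for missing IDs).

    `id_map` maps (kind, source ID) to a Wikidata item ID.
    """
    rows, missing = [], 0
    for s in suggestions:
        # The area mapping should always be present, so a KeyError here
        # would indicate a real problem rather than something to count.
        area = id_map[("area", s["area_id"])]
        person = id_map.get(("person", s["person_id"]))
        has_party = s.get("party_id") is not None
        party = id_map.get(("party", s["party_id"])) if has_party else None
        if person is None or (has_party and party is None):
            missing += 1  # not matched yet, e.g. still waiting in Mix 'n' Match
            continue
        rows.append({
            "transaction_id": s["transaction_id"],
            "person_item": person,
            "group_item": party,
            "district_item": area,
        })
    return rows, missing


id_map = {
    ("person", "p1"): "Q7148353",
    ("area", "a1"): "Q12626632",
    ("party", "g1"): "EXAMPLEGROUP",
}
suggestions = [
    {"transaction_id": "t1", "person_id": "p1", "party_id": "g1", "area_id": "a1"},
    {"transaction_id": "t2", "person_id": "p2", "party_id": "g1", "area_id": "a1"},
]
rows, missing = build_statements(suggestions, id_map)
print(len(rows), missing)
```

The returned count is what would be written to the "page" table so the verification page can decide whether to show the Mix 'n' Match link.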

Questions

Is there any way we can secure the endpoint that accepts the POST requests with the results of decisions, in order to stop someone from just recording "no" for all statements, or random decisions? (I can see how to make that more awkward, but not how to stop it completely.) This kind of abuse of the system isn't such a problem in terms of the edits made to Wikidata, since that can be handled by Wikidata moderation and reversion tools, but we need to consider potential vandalism of this API separately.

Future enhancements

It would be great if this component also generated a dashboard or summary page, linking to the verification pages that need most work, indicating progress and with a leaderboard for the users who have contributed most.

As discussed above, we'd want to keep in mind that we may want to make this tool more generic, in the sense of supporting statements other than P39 and a configurable set of qualifiers.

everypolitician / democratic-commons-tasks

create a first version of the verification-pages tool #42