popolo-project / popolo-spec

International legislative data specifications
http://www.popoloproject.com/

Add identifiers to classes with a single identifier #84

Open KrzysztofMadejski opened 9 years ago

KrzysztofMadejski commented 9 years ago

I really like the concept of the identifiers property in organization and people. It allows including many IDs, which is very often needed (web ID, some social ID numbers, etc.).

    "identifiers": {
        "description": "Issued identifiers",
        "type": "array",
        "items": {
            "$ref": "http://www.popoloproject.com/schemas/identifier.json#"
        }
    },

I wonder why this isn't applied consistently; in vote-event there is:

    "identifier": {
        "description": "An issued identifier",
        "type": [
            "string",
            "null"
        ]
    },

Multiple identifiers are a much more robust approach, and I propose including them in all Popolo classes (as is done for sources).
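To make the proposal concrete, here is a minimal Python sketch of a vote event that carries both the existing single identifier and the proposed identifiers array. The record values, the scheme names, and the helper all_identifiers are made up for illustration; the sketch assumes identifier objects have scheme and identifier fields, as in the identifier schema referenced above.

```python
# Hypothetical vote event; all values are invented for illustration.
vote_event = {
    "id": "vote-event-1",
    # The single, primary identifier in the current schema.
    "identifier": "2014-123",
    # The proposed addition, mirroring organization and people.
    "identifiers": [
        {"scheme": "example-parliament", "identifier": "2014-123"},
        {"scheme": "example-archive", "identifier": "VE/000123"},
    ],
}


def all_identifiers(record):
    """Collect (scheme, identifier) pairs from both properties, primary first.

    The primary identifier has no scheme of its own, so it is paired with None.
    """
    pairs = []
    if record.get("identifier"):
        pairs.append((None, record["identifier"]))
    for item in record.get("identifiers", []):
        pairs.append((item.get("scheme"), item["identifier"]))
    return pairs


if __name__ == "__main__":
    for scheme, value in all_identifiers(vote_event):
        print(scheme, value)
```

Consumers that only know the single identifier keep working, while others can read the array.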

CC: https://github.com/KohoVolit/api.parldata.eu/issues/5

jpmckinney commented 9 years ago

Motion, VoteEvent and Area have a top-level identifier property, because when consulting with stakeholders, an important use case was to identify the primary identifier for motions, etc. For example, many parliaments number their motions and vote events. Reporters, researchers, etc. refer to those motions and vote events by that number. So, it was important for that identifier to have a special status with respect to the others and to be more easily accessed.

When we discussed multiple identifiers, no one had an example use case. If two people have a use case, we will consider adding it though!

This is similar to how Person has a top-level email property, although contact_details could also store the email. Contacting a person via email is such a common use case that it was promoted to top-level - many use cases only track email and no other contact details.
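As a rough illustration of what "promoted to top-level" means for a consumer, the Python sketch below reads a person's email from the top-level property first and only then falls back to contact_details. The helper name primary_email and the sample record are hypothetical; the sketch assumes contact details are objects with type and value fields.

```python
def primary_email(person):
    """Return the top-level email if set, else the first email contact detail."""
    if person.get("email"):
        return person["email"]
    for detail in person.get("contact_details", []):
        if detail.get("type") == "email":
            return detail.get("value")
    return None


# Hypothetical person record, for illustration only.
person = {
    "name": "Jane Example",
    "email": "jane@example.org",
    "contact_details": [
        {"type": "voice", "value": "+1-555-0100"},
        {"type": "email", "value": "office@example.org"},
    ],
}

if __name__ == "__main__":
    print(primary_email(person))
```

A use case that only tracks email never has to look inside contact_details, which is the same convenience the top-level identifier gives for motions and vote events.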

In terms of adding it to all classes - Popolo avoids adding a property unless there is a clear use case for it.

akuckartz commented 9 years ago

@KrzysztofMadejski Did you consider using JSON-LD ?

KrzysztofMadejski commented 9 years ago

@akuckartz I don't see how JSON-LD is connected to my question. Could you elaborate on that?

@jpmckinney: Thank you for describing your process. My use case is:

My workaround for now is to parse the sources URL every time. It would be nice to have the identifier extracted into a separate field, because it is referenced in a different context than the whole sources URL.

akuckartz commented 9 years ago

@KrzysztofMadejski Maybe I misunderstand your original issue, but with the JSON-LD serialization you get multiple identifiers for free.

jpmckinney commented 9 years ago

@KrzysztofMadejski Can you give me an example of a secondary identifier from your use case? I'm not sure how source ID differs from source URL, so an example will help.

tmtmtmtm commented 9 years ago

As an example of where multiple identifiers are needed, consider migrating the constituencies.json file from parlparse (https://github.com/mysociety/parlparse/blob/master/members/constituencies.json) to Popolo Areas.

jpmckinney commented 9 years ago

Can you clarify what that file illustrates? I see multiple names, not identifiers.

tmtmtmtm commented 9 years ago

    "hansard_id": "5",
    "id": "uk.org.publicwhip/cons/1",

These are both external identifiers.

jpmckinney commented 9 years ago

Wouldn't id map to Popolo's id (since it's just an internal ID), and hansard_id can map to identifier (which is an external ID)?
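The mapping suggested here could be sketched in Python roughly as follows. The function name to_popolo_area is hypothetical, and the input is just the two fields quoted above rather than a full parlparse record.

```python
def to_popolo_area(constituency):
    """Map a parlparse-style constituency to an Area with a single identifier."""
    area = {"id": constituency["id"]}
    # Only one external identifier fits in the top-level property.
    if "hansard_id" in constituency:
        area["identifier"] = constituency["hansard_id"]
    return area


# The two fields quoted earlier in this thread.
record = {"hansard_id": "5", "id": "uk.org.publicwhip/cons/1"}

if __name__ == "__main__":
    print(to_popolo_area(record))
```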

tmtmtmtm commented 9 years ago

I'm not sure what you mean by 'internal' here. A PublicWhip ID is external to anything I'm doing.

But even if that weren't so, how would I then extend this to say that Bristol West is not only uk.org.publicwhip/cons/93, and hansard:104, but also ONS E14000602, or add an OCD division ID if/when someone adds the UK?
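For illustration, a Python sketch of the Bristol West case with an identifiers array might look like this. The scheme names ("publicwhip", "hansard", "ons") and the helper identifier_for are assumptions made for the example, not values defined by Popolo or by parlparse.

```python
# Bristol West with the three identifiers mentioned above.
# Scheme names are hypothetical labels chosen for this sketch.
bristol_west = {
    "name": "Bristol West",
    "identifiers": [
        {"scheme": "publicwhip", "identifier": "uk.org.publicwhip/cons/93"},
        {"scheme": "hansard", "identifier": "104"},
        {"scheme": "ons", "identifier": "E14000602"},
    ],
}


def identifier_for(record, scheme):
    """Return the first identifier issued under the given scheme, or None."""
    for item in record.get("identifiers", []):
        if item.get("scheme") == scheme:
            return item["identifier"]
    return None


if __name__ == "__main__":
    print(identifier_for(bristol_west, "ons"))
    # A scheme that has not been added yet simply yields None.
    print(identifier_for(bristol_west, "ocd-division"))
```

Adding an OCD division ID later would only mean appending one more entry to the array.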

jpmckinney commented 9 years ago

Ok, makes sense.

jpmckinney commented 7 years ago

@davewhiteland