choice of unique identifier

popolo-project / popolo-spec

International legislative data specifications

http://www.popoloproject.com/


choice of unique identifier #6

Closed: jpmckinney closed this issue 11 years ago

jpmckinney commented 11 years ago

MongoDB will automatically assign a 12-byte ID, which comes out to a 24-character hexadecimal string. PopIt uses standard MongoDB IDs. Billy sets the ID to the uppercase jurisdiction code, e.g. "CA" for California or "PA-PHILADELPHIA" for Philadelphia, followed by a one-letter code for the document type, e.g. "L" for legislators, and a six-digit number.
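To make the two ID shapes above concrete, here is a rough Python sketch. It is not code from PopIt or Billy. The 12 random bytes only imitate the length of a MongoDB ID (a real one is not purely random), and the Billy pattern assumes the three parts are concatenated without separators, which the description above does not spell out. The names `fake_object_id` and `parse_billy_id` are made up for illustration.

```python
import os
import re

# Stand-in for a MongoDB ID: 12 bytes rendered as 24 hex characters.
# Assumption: random bytes are used only to show the length and alphabet.
def fake_object_id() -> str:
    return os.urandom(12).hex()

OBJECT_ID_RE = re.compile(r"[0-9a-f]{24}")

# Assumed Billy-style layout: uppercase jurisdiction code (may contain
# hyphens), one uppercase letter for the document type, six digits.
BILLY_ID_RE = re.compile(r"([A-Z]+(?:-[A-Z]+)*)([A-Z])(\d{6})")

def parse_billy_id(identifier: str):
    """Split a Billy-style ID into its parts, or return None if it does not fit."""
    match = BILLY_ID_RE.fullmatch(identifier)
    if match is None:
        return None
    jurisdiction, doc_type, number = match.groups()
    return {"jurisdiction": jurisdiction, "type": doc_type, "number": int(number)}

if __name__ == "__main__":
    print(fake_object_id())
    print(parse_billy_id("CAL000123"))
    print(parse_billy_id("PA-PHILADELPHIAL000042"))
```

One consequence of the assumed layout: the type letter can only be recovered because the jurisdiction code is followed by exactly one letter and six digits, so the parser has to work backwards from the end of the string.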

Popolo currently has no recommendation for identifiers. Systems may choose any identifier scheme.

evdb commented 11 years ago

From the perspective of PopIt, we will probably make every id a URL, so that results from several PopIt data stores can be combined in one application. Perhaps the spec should say something about ids being usable this way: both universally unique and acting as an address for the canonical source.

In the spec it currently says that the id in the JSON serialisation should be the MongoDB id. Perhaps another field is required to store a public id for an entry? For us that would be the URL above.
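A small Python sketch of the problem being described, under made-up assumptions: two hypothetical data stores each use a local id of "123", and the base URLs `http://a.example.com/people` and `http://b.example.com/people` are invented. It only illustrates why a URL-valued id survives merging where a local id does not; it is not how PopIt works.

```python
def merge_by_id(*stores):
    """Combine records from several stores, keyed on their "id" field.

    A later record silently replaces an earlier one with the same id.
    """
    merged = {}
    for store in stores:
        for record in store:
            merged[record["id"]] = record
    return merged

def with_url_id(record, base_url):
    """Return a copy of the record whose id is a URL under base_url (hypothetical scheme)."""
    return {**record, "id": f"{base_url}/{record['id']}"}

# Two hypothetical stores that happen to reuse the same local id.
store_a = [{"id": "123", "name": "Alice"}]
store_b = [{"id": "123", "name": "Bob"}]

# Local ids collide: only one record is left after merging.
local = merge_by_id(store_a, store_b)

# URL ids stay distinct because each store has its own base URL.
public = merge_by_id(
    [with_url_id(r, "http://a.example.com/people") for r in store_a],
    [with_url_id(r, "http://b.example.com/people") for r in store_b],
)

if __name__ == "__main__":
    print(len(local), len(public))
```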

jpmckinney commented 11 years ago

I think it would be fine for the JSON id and the MongoDB _id to have different values. Do you think we should require the JSON id to be a URL to the JSON document? I'm OK with that.

jpmckinney commented 11 years ago

FYI, Sunlight is working on a new identifier scheme. Hopefully details will follow.

jpmckinney commented 11 years ago

Thinking about it a bit more: I think id should be the actual ID. There are many hypermedia standards now, and I'm not sure which is winning, but HAL would have you do:

{
  "_links": {
    "self": { "href": "http://example.com/people/123" }
  },
  "id": "123",
  "name": "Mr. John Q. Public, Esq."
}

Popolo can recommend a particular hypermedia standard.
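For illustration, the HAL-style document above could be produced from a plain record roughly like this. This is only a sketch: `to_hal`, `self_href`, and the base URL argument are hypothetical names, and nothing here is prescribed by Popolo.

```python
import json

def to_hal(record, base_url):
    """Wrap a record in a HAL-style document: the id stays the actual ID,
    and the address of the document goes under _links.self.href."""
    document = {"_links": {"self": {"href": f"{base_url}/{record['id']}"}}}
    document.update(record)
    return document

def self_href(document):
    """Read the canonical address back out of a HAL-style document."""
    return document["_links"]["self"]["href"]

if __name__ == "__main__":
    person = {"id": "123", "name": "Mr. John Q. Public, Esq."}
    document = to_hal(person, "http://example.com/people")
    print(json.dumps(document, indent=2))
    print(self_href(document))
```

Keeping the two apart means a consumer can use the id as a key and the self link as a locator, without parsing one out of the other.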

jpmckinney commented 11 years ago

Popolo still has no recommendation for identifiers. I've added a section to a new "software component" page about the choice of identifiers, recommending URLs, but offering other alternatives mentioned here. To be pushed in coming days.