open-contracting / standard

Documentation of the Open Contracting Data Standard (OCDS)
http://standard.open-contracting.org/

Organization ID #62

Closed michaeloroberts closed 10 years ago

michaeloroberts commented 10 years ago

We follow the IATI Organisation ID Standard for organisational identifiers. It is very good to keep in sync with other standards like IATI so we can follow the money, but this could be a risk given that the IATI community is debating new approaches to managing the Org ID and things seem somewhat in flux at the moment. Might want to consider a related element that maps to IATI if the IATI record identifier is known.

jpmckinney commented 10 years ago

Perhaps the Organization ID should be an array, to allow for multiple identifiers and schemes? So, there could be DUNS, IATI, GLN, OpenCorporates, local corporate registry, etc.
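As an illustration only (not part of OCDS as written), an array of identifiers could look like the sketch below. The field names `scheme` and `uid` follow the split described later in this thread; the `identifiers` key, the `find_identifier` helper, and all values are hypothetical.

```python
# Hypothetical organization record carrying several identifiers at once.
# "identifiers" is an assumed key name; the values are made up.
organization = {
    "name": "Example Supplier Ltd",
    "identifiers": [
        {"scheme": "GB-COH", "uid": "01234567"},
        {"scheme": "DUNS", "uid": "123456789"},
    ],
}


def find_identifier(org, scheme):
    """Return the uid recorded under the given scheme, or None if absent."""
    for identifier in org.get("identifiers", []):
        if identifier["scheme"] == scheme:
            return identifier["uid"]
    return None
```

With no primary identifier, a consumer simply asks for whichever scheme it can match on and handles the `None` case when a publisher did not supply it.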

anderspeders commented 10 years ago

Simply noting that the Budget Data Package team has discussed similar challenges around Org ID. We are keen to explore the pros and cons of various models. Also adding http://publicbodies.org/ and Open Civic Data to the mix: https://github.com/opencivicdata/ocd-division-ids

jpmckinney commented 10 years ago

Can you link to any BDP discussions?

OCD-IDs don't yet identify organizations. For what it's worth, the Sunlight Foundation often adopts the same approach as I described (no primary identifier, just an array of identifiers).

While I like the idea of publicbodies.org, there's very little adoption, it's fairly inactive, and its governance and identifier scheme don't guarantee stability of identifiers over time.

anderspeders commented 10 years ago

Sure, one thread here: https://github.com/openspending/budget-data-package/issues/13 Yes, agree that none of these options are ideal.

My sense is that in order to reach a viable solution we'd need dedicated time to be set aside for this. But clearly OC standard, budget data package and IATI are all struggling with the same issue here.

jpmckinney commented 10 years ago

Can you describe what the issue is with simply letting people use whatever identifier scheme makes sense in their jurisdiction - as long as they identify their identifier scheme? Maybe it's DUNS, GLN, corporate identifier, etc. Whatever the case, the identifier will be unique within the scope of its scheme.

Inventing a new public DUNS-like system would not be realistic or timely.

anderspeders commented 10 years ago

That is possibly the best solution :-)

I am not sure if IATI has made any final decision wrt. how to assign org ID going forward?

Agree that inventing a new system is unrealistic.

practicalparticipation commented 10 years ago

I think it's worth separating two issues here.

The IATI Organisation ID 'Convention' (keep IDs as a simple string, re-use existing identifiers, prefix them with a 'registration agency' code)

and

The Public Organisations challenge - often government depts are not 'listed' in any stable source, so the IATI Organisation ID Convention fails to provide good quality stable identifiers for these.

On the Organisation ID Convention in use: It looks like there is reasonable consensus in IATI to stick with the compound string for a primary identifier (i.e. not splitting an identifier into <organisation @vocab="GB-COH" @id="123567"> but keeping something like <organisation @id="GB-COH-123567">), and plans to potentially introduce, in addition to a primary identifier, space for 'secondary identifiers' of a number of types, including alternative IDs and previously known IDs.

However, the gaps in the current IATI Organisation ID Convention that do affect Open Contracting are:

On the public organisations challenge: my view is that we need some scoping work shared between various projects working in this area to identify what a minimal ongoing service to provide stable identifiers for public bodies might look like, which might involve revitalising PublicBodies.org, or it might involve some other approach. But this doesn't have to be solved before adopting a convention for compound string IDs.

Summary

My view is that adopting the IATI Organisation ID Convention (disclosure: I wrote it for IATI) for a primary organisation identifier is helpful for many use cases, including flat presentation of data - and OC should work with IATI (and others) to iron out remaining implementation issues.

I would support adding another field for alternative IDs, and perhaps also looking to the work on an 'Other Identifiers' field being explored in the IATI 2.01 process.

jpmckinney commented 10 years ago

Sure, a convention of concatenating a stable prefix to a locally-unique identifier is fine, and fairly universal across publicbodies.org, OCD-ID, etc. However, as currently written, OCDS splits out the prefix into a scheme field and puts the locally-unique identifier in a uid field. I don't see an issue with that; it's trivial to join the two with a hyphen.
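A minimal sketch of the join described above, plus the reverse direction, which is less trivial. The function names and the `known_schemes` list are hypothetical; the `GB-COH-123567` value is the example used earlier in this thread.

```python
def compound_id(scheme, uid):
    """Join a scheme prefix and a locally-unique identifier with a hyphen."""
    return f"{scheme}-{uid}"


def split_compound_id(value, known_schemes):
    """Split a compound ID back into (scheme, uid).

    A prefix such as "GB-COH" contains a hyphen itself, and local IDs may
    too, so splitting on the first or last hyphen is not safe. This sketch
    matches against a caller-supplied list of known prefixes instead.
    """
    # Try longer prefixes first so "GB-COH" wins over a shorter "GB".
    for scheme in sorted(known_schemes, key=len, reverse=True):
        prefix = scheme + "-"
        if value.startswith(prefix):
            return scheme, value[len(prefix):]
    raise ValueError(f"no known scheme prefix in {value!r}")
```

Joining needs nothing but the two fields, whereas splitting needs a prefix list, which is one practical argument for storing `scheme` and `uid` separately and compounding only on output.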

Since OCDS targets governments as adopters, can't it be a responsibility of the governments to choose a scheme, or create a scheme if they lack stable identifiers for government departments, etc.? Why should the opengov community have to populate publicbodies.org (or whatever else) in an error-prone manner?

birdsarah commented 10 years ago

On a slightly broader / philosophical note. The standard requires that three things have unique identifiers - the OCID, the award, the contract. Then there are a host of other things that we would like to be uniquely identifiable but recognize that not everyone is going to adopt an interoperable identification system for all things. So, with @jenit in Berlin we came up with a repeatable way of identifying things that could be used to provide linkable data or could be used as something less than that.

Given that the whole purpose of this more generic identifier is to accommodate publishers who are publishing less-good information, I would be hesitant to follow any standard for the identifier.

birdsarah commented 10 years ago

@jpmckinney the array of identifiers is interesting, but it increases complexity and makes flattening harder. I'd be keen to get some supply or demand pull for this before we change anything. From the supply side, in our data review we barely saw organizational identifiers, let alone multiple ones.

jpmckinney commented 10 years ago

Sure, we needn't have multiple identifiers. In general, any features that are unimplemented should be cut.

My suggestions (and concerns) around the 1.0 timeline are in #60, which includes a description of a common process for collecting implementations and cutting features.

birdsarah commented 10 years ago

Hi, I've realized that the generic identifier I was talking about is a bit of a misnomer: as it's washed out, the organization is the only thing that uses it. That's making me feel more relaxed about picking a system that's most appropriate for organizations - I don't have an opinion on what that might be, but am definitely feeling flexible.

LindseyAM commented 10 years ago

@sdavenpo422 might have thoughts.

practicalparticipation commented 10 years ago

Below is draft guidance on Organisational IDs, based on the IATI approach.

I'm still open to suggesting we split this into scheme and ID, rather than compound it.

Identifying organisations

Reliably identifying the legal entities involved in a contracting process is vital for transparency and accountability, and for carrying out analysis to improve procurement.

Publishers should collect and record the legal identifier from an official register of any organisations involved in a contracting process, and should include this in their OCDS files.

There are two parts to expressing an organisation identifier in open contracting data.

  1. An organisation register prefix identifying a register in which the organisation is identified
  2. The existing organisational ID provided in that public register

These are combined into a string (with - as the separator).

For example, the organisation register prefix for UK Companies House is GB-COH. The organisation Development Initiatives has been assigned the company number ‘06368740’ by Companies House. The globally unique organisation identifier for Development Initiatives is then ‘GB-COH-06368740’.

Publishers may choose to store this in their databases in two parts (register, and registered ID), or as a single string. As the identifiers coming from third-party systems can contain various characters, including / and -, we recommend that publishers pay careful attention to the validation requirements of any database fields storing these identifiers. Users may find that stripping all non-alphanumeric characters from organisation IDs when analysing data helps avoid missed matches caused by differences in how publishers' databases handle certain characters.
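The stripping step suggested for data users could be sketched as follows. The function name is hypothetical, and treating only ASCII letters and digits as alphanumeric is an assumption of this sketch, not something the guidance specifies. The result is a matching key for analysis, not a replacement for the published identifier.

```python
import re


def normalise_org_id(org_id):
    """Return a matching key with every non-alphanumeric character removed.

    Assumption: only ASCII letters and digits are kept. Case is left
    unchanged, since the guidance says nothing about case folding.
    """
    return re.sub(r"[^A-Za-z0-9]", "", org_id)


# Two renderings of the same identifier that differ only in punctuation.
same_org = normalise_org_id("GB-COH-06368740") == normalise_org_id("GB-COH/06368740")
```

Comparing the normalised keys rather than the raw strings lets records from different publishers match even when one system has replaced or dropped a separator.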

The organisation register prefix is used to refer to a register from which the organisation identifier is drawn. There are a range of different kinds of organisation list:

At present, the OCDS standard defers to the organisation list prefixes provided by the IATI Organisation Registration Agency codelist. If you require codes to be added to this list, please contact the Open Contracting Data Standard support and they will work to achieve this. 

LindseyAM commented 10 years ago

In our star levels - will we differentiate between naming contractors (and other orgs) in text, using local or system specific identifiers, and the advanced approach of a prefixed identifier that you describe above? Copying @marcelarozo for her reference.

bill-anderson commented 9 years ago

re: "At present, the OCDS standard defers to the organisation list prefixes provided by the IATI Organisation Registration Agency codelist. If you require codes to be added to this list, please contact the Open Contracting Data Standard support and they will work to achieve this."

The IATI Tech Team would be happy to receive requests directly: