markbrough / IATI-Data

Shows aid information from the IATI Registry
http://iatidata.heroku.com

Normalisation of countries and organisations. And other fields? #2

Open markbrough opened 13 years ago

markbrough commented 13 years ago

What else should be normalised - statuses?

How to deal with merging organisations?

aidinfolabs commented 13 years ago

There is a general data normalisation diagram here: http://support.iatistandard.org/entries/20200006-data-normalisation-iati-in-relational-databases which may provide guidance on how to normalise.

An API providing access to the full code lists will be available soon, which may help. It will give each code in a code list, with its text in any available languages, as XML or JSON.

In terms of merging organisations: ideally this should happen upstream (i.e. getting publishers to converge on identifiers). Alternatively, it could make use of a separate table of linkages. I'm not sure how this would work for queries, but it would enable use of external data sources for equivalence or relationships of organisations, such as the 'group' feature in OpenCorporates: http://opencorporates.wordpress.com/2011/06/01/introducing-corporategroupings-where-fuzzy-concepts-meet-legal-entities/

markbrough commented 13 years ago

That's really useful. Thanks Tim! The OpenCorporates "groups" idea is a good one. Could possibly fix them by either: a) saying organisation 1 is "the same as" organisation 2, i.e. the grouping idea; or b) merging organisations, i.e. find/replace all references to organisation 1 with references to organisation 2?
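
For illustration, option (b) might look roughly like the sketch below. It is only a sketch: the `organisations` and `activities` tables, their columns, and the `merge_organisations` helper are all hypothetical names, and SQLite is used purely to keep the example self-contained. The project's actual schema is not shown in this thread.

```python
import sqlite3


def merge_organisations(conn, duplicate_id, canonical_id):
    """Repoint all references from duplicate_id to canonical_id, then drop the duplicate.

    Hypothetical helper: table and column names are assumptions for illustration.
    """
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "UPDATE activities SET organisation_id = ? WHERE organisation_id = ?",
            (canonical_id, duplicate_id),
        )
        conn.execute("DELETE FROM organisations WHERE id = ?", (duplicate_id,))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE activities (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        organisation_id INTEGER REFERENCES organisations(id)
    );
    -- Placeholder rows, not real IATI data
    INSERT INTO organisations VALUES (1, 'Org 1'), (2, 'Org 2');
    INSERT INTO activities VALUES (10, 'Activity A', 1), (11, 'Activity B', 2);
    """
)

# Treat organisation 1 as a duplicate of organisation 2
merge_organisations(conn, duplicate_id=1, canonical_id=2)
```

One trade-off worth noting: a merge like this is destructive, so the original publisher-supplied reference is lost unless it is stored somewhere else, whereas a "same as" link (option a) keeps both records.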

aidinfolabs commented 13 years ago

I would go with a relationships table that would allow for:

Org 1 [is identical to] Org 2

and

Org 1 [is in the same group as] Org 2

or

Org 1 [is subsidiary of] Org 2

relationships to capture the different sorts of relationship people might be able to work out.
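
A minimal sketch of such a relationships table, and of one way queries could use it, is below. Everything here is an assumption for illustration: the table and column names, the `identical_to` helper, and the choice to treat "is identical to" as symmetric and transitive. SQLite is used only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE organisation_relationships (
        org_id INTEGER NOT NULL REFERENCES organisations(id),
        relationship TEXT NOT NULL CHECK (relationship IN (
            'is identical to', 'is in the same group as', 'is subsidiary of'
        )),
        related_org_id INTEGER NOT NULL REFERENCES organisations(id),
        PRIMARY KEY (org_id, relationship, related_org_id)
    );
    -- Placeholder rows, not real IATI data
    INSERT INTO organisations VALUES (1, 'Org 1'), (2, 'Org 2'), (3, 'Org 3'), (4, 'Org 4');
    INSERT INTO organisation_relationships VALUES
        (1, 'is identical to', 2),
        (2, 'is identical to', 3),
        (4, 'is subsidiary of', 1);
    """
)


def identical_to(conn, org_id):
    """Return the set of ids linked to org_id by 'is identical to' (including org_id).

    The link is followed in both directions and across chains, which is an
    assumption about how the relationship should behave.
    """
    seen = {org_id}
    queue = [org_id]
    while queue:
        current = queue.pop()
        rows = conn.execute(
            """
            SELECT related_org_id FROM organisation_relationships
             WHERE org_id = ? AND relationship = 'is identical to'
            UNION
            SELECT org_id FROM organisation_relationships
             WHERE related_org_id = ? AND relationship = 'is identical to'
            """,
            (current, current),
        ).fetchall()
        for (other,) in rows:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


same_as_org_1 = identical_to(conn, 1)
```

A query for all activities of Org 1 could then filter on `organisation_id IN (...)` using the returned set, leaving the publisher-supplied identifiers untouched. The "same group" and "subsidiary" links are stored but not followed here, since whether they should widen a query is a separate decision.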