akvo / akvo-rsr

Akvo Really Simple Reporting
http://rsr.akvo.org
GNU Affero General Public License v3.0

Akvo Write API - 1 - Importing projects as IATI Standard XML #121

Closed adriancollier closed 11 years ago

adriancollier commented 12 years ago

RSR now has a read-only API that allows the programmatic exporting of a fair amount of its data.

We now want to be able to import project data programmatically into RSR. Tastypie provides for almost seamless importing of XML- or JSON-formatted data through its resources. It needs only some extra work around authentication, provided that the imported data conforms to how Tastypie formats the data.
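As a rough sketch of what such an import call could look like, the snippet below builds (but does not send) a JSON POST request. The resource path `/api/v1/project/`, the host name and the credentials are placeholders, and using Tastypie's `ApiKey` authorization header is an assumption, since the authentication scheme has not been decided in this issue.

```python
import json
from urllib.request import Request


def build_project_post(base_url, username, api_key, project):
    """Build (but do not send) a POST request carrying one project as JSON."""
    body = json.dumps(project).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/api/v1/project/",  # assumed resource path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Header format used by Tastypie's ApiKeyAuthentication; whether
            # RSR ends up using this scheme is an assumption.
            "Authorization": "ApiKey %s:%s" % (username, api_key),
        },
    )


# Placeholder host, user and key, for illustration only.
request = build_project_post(
    "http://rsr.example.org", "importer", "secret", {"title": "Example project"}
)
```

Passing the request to `urllib.request.urlopen` would perform the actual import; that step is left out here so the sketch runs without a server.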

We also want to be able to import and export data in the IATI Standard format. It is close enough to the RSR schema that, with some additions, we can meaningfully use the IATI Standard as a means of data exchange.

To be able to use IATI-formatted data in RSR we need a transform of some kind. Since Tastypie provides the necessary interface to import XML, "all" we need to do is transform the IATI XML into Tastypie XML and then import it through the API. Exporting RSR data as an IATI activity should also be possible by transforming the Tastypie XML into IATI XML.

Unfortunately these two transforms are not symmetrical: they have to be implemented separately, so we're starting with importing. We create an XSLT transform that is applied to the IATI activity; the result is XML we can feed to Tastypie, which processes it and creates the new RSR objects.
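To illustrate the kind of mapping the transform performs, here is a minimal sketch. Note that it does not use XSLT: Python's standard library has no XSLT processor, so the same idea is expressed with `xml.etree.ElementTree` instead. The output field names (`title`, `description`, `currency`) and the sample activity are assumptions, not the actual RSR schema.

```python
import xml.etree.ElementTree as ET

# Made-up sample activity, for illustration only.
IATI_ACTIVITY = """\
<iati-activity xml:lang="en" default-currency="EUR">
    <title>Example project</title>
    <description type="1">A general description</description>
</iati-activity>"""


def iati_to_tastypie(iati_xml):
    """Map a few IATI activity fields onto a Tastypie-style <object> element."""
    activity = ET.fromstring(iati_xml)
    obj = ET.Element("object")
    # Hypothetical mapping: IATI element name -> RSR/Tastypie field name.
    for iati_tag, field in (("title", "title"), ("description", "description")):
        source = activity.find(iati_tag)
        if source is not None and source.text:
            ET.SubElement(obj, field).text = source.text.strip()
    # Attributes on the activity tag can be mapped to fields in the same way.
    currency = activity.get("default-currency")
    if currency:
        ET.SubElement(obj, "currency").text = currency
    return ET.tostring(obj, encoding="unicode")


tastypie_xml = iati_to_tastypie(IATI_ACTIVITY)
```

The real transform would cover far more fields; the point is only that each IATI element or attribute is mapped to one field in the XML that Tastypie accepts.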

The spec of the entire feature is here: https://akvo.teamworkpm.net/notebooks/13783

zzgvh commented 11 years ago

Thoughts and comments on the Cordaid XML

Namespace

The akvo namespace link should only appear in the iati-activity tag and should use the correct URI. In the rest of the document it is enough to use the akvo: prefix to tags and attributes:

<iati-activity xmlns:akvo="http://www.akvo.org/xml/v1/iati/activity" last-updated-datetime="2012-07-30T10:40:47+00:00" xml:lang="en" default-currency="EUR">
    <description type="1" xml:lang="en" akvo:type="Subtitle">
        ...
</iati-activity>

Language

It is very confusing to have a lot of xml:lang="en" attributes while the actual texts are in Dutch ;-) Ideally the main language should be set in the <iati-activity> tag, and only exceptions should be noted with the appropriate attribute. Having a general language set in this way also allows us to set the main language for the project in RSR in a systematic way.
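A small sketch of how an importer could resolve the language under that convention: use the element's own xml:lang when present, otherwise fall back to the one on the activity tag. The sample XML and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample: Dutch as the main language, one English exception.
ACTIVITY = """\
<iati-activity xml:lang="nl">
    <title>Schoon drinkwater</title>
    <description xml:lang="en">Clean drinking water</description>
</iati-activity>"""


def effective_language(element, activity):
    """Return the element's own xml:lang, else the activity-level default."""
    return element.get(XML_LANG) or activity.get(XML_LANG)


activity = ET.fromstring(ACTIVITY)
languages = {child.tag: effective_language(child, activity) for child in activity}
```

Here `languages` maps `title` to the activity default and `description` to its own override.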

Project description

There are problems with some of the akvo:type attributes, e.g. akvo:type="Project Plan". The problem is that the attribute values are case sensitive and in the XSLT we are matching "Project plan". We could solve this by making absolutely sure all strings match exactly, but that's brittle and inelegant. I suggest that we instead extend the IATI Description Type code list with new codes beyond the three that are defined:

Code    Name            Description
1       General         Long description of the activity with no particular structure
2       Objectives      Objectives for the activity, for example from a logical framework
3       Target Groups   Statement of groups targeted to benefit from the activity
Akvo extensions:
4       Subtitle        A subtitle with more information on the project
5       Summary         A brief summary of the project
6       Background      Relevant background information
7       Project Plan    Detailed information about the project and plans for implementing: the what, how, who and when
8       Goal Overview   Describe what the project hopes to accomplish
9       Current status  Description of current phase of project
10      Sustainability  Plans for sustaining/maintaining results after implementation is complete

Maybe Code 1, General, can be used instead of the Akvo Code 7, Project plan, and Code 2, Objectives, instead of 8, Goal overview? Input from the partner team would be good. One issue with extending the code list is that I have no idea how we express that in terms of extending the standard. Do we extend the code list in the akvo namespace, is that even possible? Or is there some other way to indicate an extension to a standard code list?
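Until the code list question is settled, a tolerant lookup would at least remove the case-sensitivity problem. The sketch below maps an akvo:type label to the codes from the table above, ignoring case and extra whitespace. Codes 4 to 10 are the proposed Akvo extensions, not part of the IATI standard, and the function name is hypothetical.

```python
# Codes 1-3 come from the IATI Description Type code list; 4-10 are the
# Akvo extensions proposed in this comment (not part of the standard).
DESCRIPTION_TYPE_CODES = {
    "general": 1,
    "objectives": 2,
    "target groups": 3,
    "subtitle": 4,
    "summary": 5,
    "background": 6,
    "project plan": 7,
    "goal overview": 8,
    "current status": 9,
    "sustainability": 10,
}


def description_code(label):
    """Return the code for a label, ignoring case and surplus whitespace."""
    key = " ".join(label.split()).lower()
    return DESCRIPTION_TYPE_CODES.get(key)  # None for unknown labels


codes = [description_code(s) for s in ("Project Plan", "Project plan", " goal  overview ")]
```

With this, "Project Plan" and "Project plan" resolve to the same code, and an unknown label yields `None` so the importer can report it rather than silently drop the text.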

Result/Indicator

We need to look at how to be able to process <result> and <indicator> data. Perhaps by adding a code list of our own with the RSR list of new benchmarks and adding them to the indicator-tags as attributes? If we get this working it's a big win since much of the value of a project is shown in these data.

Location

<location> needs to include the country attribute in the <administrative> sub-tag; it's the sanest way to get a country for each location. We could build a fall-back using a geolocating web service, but it's messy. Also, the IATI standard requires @country for compliance:

<location percentage="34">
    <administrative country="CD">CONGO, THE DEMOCRATIC REPUBLIC OF THE</administrative>
    <location-type code="PPLC">Capital of a political entity</location-type>
    <name>Kinshasa</name>
    <coordinates latitude="-4.331667000000000000" longitude="15.313889000000000000" precision="5" />
</location>
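For illustration, an importer could read such a location and refuse it when the country is missing, instead of falling back to geolocation. This is a sketch using `xml.etree.ElementTree`; the returned dictionary keys are assumptions and do not reflect the actual RSR location model.

```python
import xml.etree.ElementTree as ET

LOCATION = """\
<location percentage="34">
    <administrative country="CD">CONGO, THE DEMOCRATIC REPUBLIC OF THE</administrative>
    <name>Kinshasa</name>
    <coordinates latitude="-4.331667" longitude="15.313889" precision="5" />
</location>"""


def parse_location(location_xml):
    """Extract country, name and coordinates; fail loudly without a country."""
    location = ET.fromstring(location_xml)
    administrative = location.find("administrative")
    country = administrative.get("country") if administrative is not None else None
    if not country:
        raise ValueError("location is missing administrative/@country")
    latitude = longitude = None
    coordinates = location.find("coordinates")
    if coordinates is not None:
        latitude = float(coordinates.get("latitude"))
        longitude = float(coordinates.get("longitude"))
    return {
        "country": country,
        "name": location.findtext("name"),
        "latitude": latitude,
        "longitude": longitude,
    }


parsed = parse_location(LOCATION)
```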

Organisation type

IATI has an Organisation Type code list with 10 kinds of orgs. Akvo only has four types: Governmental, NGO, Commercial and Knowledge institution. I see two solutions here. Either we create a "translation table" and still use the Akvo types internally or we bite the bullet and use the IATI code list internally.

Translation means we lose fidelity in the data when importing and there are a couple of items on the IATI code list that I don't know how to translate (someone else may have a good answer there tho). Adopting the IATI codes internally means we have to go through all organisations in the system and set the new IATI org type for them.

Here's a table with IATI and Akvo org types

Code    Name                            Akvo org type
10      Government                      ORG_TYPE_GOV
15      Other Public Sector             ORG_TYPE_GOV
21      International NGO               ORG_TYPE_NGO
22      National NGO                    ORG_TYPE_NGO
23      Regional NGO                    ORG_TYPE_NGO
30      Public Private Partnership      ?
40      Multilateral                    ?
60      Foundation                      ORG_TYPE_NGO
70      Private Sector                  ORG_TYPE_COM
80      Academic, Training and Research ORG_TYPE_KNO
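If we go for the translation table, it could be as simple as the sketch below. The two codes marked "?" above are mapped to `None` so the importer can flag them for a decision instead of guessing; the function name is hypothetical.

```python
# IATI Organisation Type code -> Akvo org type, copied from the table above.
# None marks the codes that have no agreed translation yet.
IATI_TO_AKVO_ORG_TYPE = {
    "10": "ORG_TYPE_GOV",
    "15": "ORG_TYPE_GOV",
    "21": "ORG_TYPE_NGO",
    "22": "ORG_TYPE_NGO",
    "23": "ORG_TYPE_NGO",
    "30": None,  # Public Private Partnership: undecided
    "40": None,  # Multilateral: undecided
    "60": "ORG_TYPE_NGO",
    "70": "ORG_TYPE_COM",
    "80": "ORG_TYPE_KNO",
}


def akvo_org_type(iati_code):
    """Translate an IATI org type code; raise on codes outside the list."""
    code = str(iati_code)
    if code not in IATI_TO_AKVO_ORG_TYPE:
        raise ValueError("unknown IATI organisation type: %s" % code)
    return IATI_TO_AKVO_ORG_TYPE[code]


ngo = akvo_org_type(21)
undecided = akvo_org_type("30")
```

The loss of fidelity is visible in the table itself: five IATI codes collapse into ORG_TYPE_NGO and ORG_TYPE_GOV, so the original code cannot be recovered on export.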

Participating-org

<participating-org> needs the ref attribute and it should refer to the IATI organisation identifier of the organisation in question:

<participating-org ref="NL-KVK-41160054" role="Accountable" type="21">Stichting Cordaid</participating-org>

A secondary solution would be to use the primary key value of the org in RSR if IATI IDs are hard to produce:

<participating-org akvo:ref="273" role="Accountable" type="21">Stichting Cordaid</participating-org>

zzgvh commented 11 years ago

More on participating-org attributes

To be able to identify Cordaid's partners that do not have an IATI ID and are not already in RSR it has been proposed to use Cordaid's internal ID for partner organisations. In the latest example of the XML they are indicated using the attribute akvo:relationId. I would like to change that to akvo:internal-ref. There are two reasons: I don't think we should introduce camel-case in a tag soup that is purely lower-case-with-hyphens. And I think internal-ref is a better description of the attribute, related to the existing ref and akvo:ref attributes. An example taken from the latest XML sample would look like this:

<participating-org xmlns:akvo="http://www.akvo.org" role="Implementing" type="21" akvo:internal-ref="2FC40AAB-90EB-4E07-9931-AEAD790E71F4">Caritas Switzerland</participating-org>
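With three possible identifiers, the importer needs a lookup order. One plausible order, sketched below, is IATI ref first, then the RSR primary key in akvo:ref, then akvo:internal-ref. The order itself, the function name and the use of the namespace URI from the first example (rather than the one in the sample above) are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, taken from the <iati-activity> example earlier on.
AKVO_NS = "http://www.akvo.org/xml/v1/iati/activity"

ACTIVITY = """\
<iati-activity xmlns:akvo="http://www.akvo.org/xml/v1/iati/activity">
    <participating-org ref="NL-KVK-41160054" role="Accountable" type="21">Stichting Cordaid</participating-org>
    <participating-org akvo:ref="273" role="Accountable" type="21">Stichting Cordaid</participating-org>
    <participating-org akvo:internal-ref="2FC40AAB-90EB-4E07-9931-AEAD790E71F4" role="Implementing" type="21">Caritas Switzerland</participating-org>
</iati-activity>"""


def organisation_key(org):
    """Return (kind, value) for the best identifier available, else None."""
    candidates = (
        ("iati", org.get("ref")),
        ("rsr", org.get("{%s}ref" % AKVO_NS)),
        ("internal", org.get("{%s}internal-ref" % AKVO_NS)),
    )
    for kind, value in candidates:
        if value:
            return (kind, value)
    return None


root = ET.fromstring(ACTIVITY)
keys = [organisation_key(org) for org in root.findall("participating-org")]
```

An organisation for which `organisation_key` returns `None` cannot be matched at all and would have to be reported back to the data provider.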

adriancollier commented 11 years ago

I've been doing some checking and thinking at the weekend for this, and I think there are a couple of things that maybe I have requested incorrectly.

I mentioned we needed to use the Put, but I forgot that we had mentioned that. The 2 options I see are:

  1. Use a Function to check all existing projects. Post new ones and Put existing ones to update their details to the most recent.
  2. Load all Projects from the XML as new projects and remove all previously existing projects that would otherwise cause a duplication.

I think we'll need to go with 2 - and then just ensure that the Project Updates are moved to the new project where needed.
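To make the difference between the two options concrete, here is a sketch that plans each of them from the same inputs. It assumes projects can be matched on an internal ID carried in the XML; the function names and the sample data are made up.

```python
def plan_update_in_place(existing, incoming_ids):
    """Option 1: POST projects that are new, PUT those that already exist."""
    post = [i for i in incoming_ids if i not in existing]
    put = [i for i in incoming_ids if i in existing]
    return post, put


def plan_replace(existing, incoming_ids):
    """Option 2: create everything anew and retire the old duplicates.

    Returns the IDs to create plus the RSR primary keys of the old projects
    whose Project Updates have to be moved before they are removed.
    """
    create = list(incoming_ids)
    retire = [existing[i] for i in incoming_ids if i in existing]
    return create, retire


# Made-up data: internal ID -> primary key of the project already in RSR.
existing = {"A": 101, "B": 102}
incoming_ids = ["B", "C"]

option_1 = plan_update_in_place(existing, incoming_ids)
option_2 = plan_replace(existing, incoming_ids)
```

Project "A" is in RSR but not in the file, so neither plan touches it; what happens to such projects is a separate decision.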

The next thing was to ensure that we are using the right file: https://www.dropbox.com/s/j5x7ogl0dqd1uin/iati_export0.xml

During the last attempt there were things I ended up doing manually on the project content. I know the updates will probably still need to be moved, but we should try to ensure the others can be included:

  1. Photos per project
  2. Benchmarks and Goals added from the file
  3. Updates (are in the right place post-migration)

Other tasks that we need to resolve:

  1. Testing for the release we have to put in (almost complete)
  2. Put the release live
  3. Contact Marko to inform of the projects that need to be removed from the Synchronisation
  4. Add any manual changes since Cordaid started to use Test2 (Charlotte)

What we probably need to push back until after this migration:

  1. Additional location info such as City.
  2. Logos
  3. Anything else I have forgotten

adriancollier commented 11 years ago

OK, I have made the changes as necessary to the Live site. https://www.dropbox.com/s/29napw4arknfnp1/Test2_to_Live_Migration.xlsx

All projects that are in the file AND already existing I have set to Unpublished and the Internal ID is removed (GREEN).
All projects that are NOT in the file AND exist already, I have removed the IDs so they will not show up (ORANGE).
All projects that are NOT in the file AND exist already AND need to be removed I have set to Unpublished (PINK).
All projects that are NOT in the file AND need to be confirmed by C4C before removal I have removed the IDs so they will not show up (RED).
All projects that SHOULD NOT be published, but ARE IN the file need to be removed afterwards (BLUE).

adriancollier commented 11 years ago

FYI the Sync file info for Cordaid:

This is the template Sync details to be copied and completed into a new Message here for each Sync that is carried out from Daisy to RSR:

adriancollier commented 11 years ago

Seeing as all the remaining features have been requested in:

225 API Security

226 API Validation

224 API Put Functionality

136 API Caching

140 Benchmarks, Indicators and Categories

We can close this issue.

There is a lot of useful information here though, so I will start the work on pulling this into more recognisable documentation.

Thanks especially @zzgvh for all your hard work on this