IATI / IATI-Codelists-NonEmbedded

IATI Codelists that are 'non-functional' and usually provide lookup information.
http://iatistandard.org/codelists/codelist-management/

DAC Codelist update from 28-02-2020 #326

Closed akmiller01 closed 4 years ago

andylolz commented 4 years ago

There are lots of instances of withdrawn codes being removed completely in this PR. Withdrawn codes should not be removed.

akmiller01 commented 4 years ago

Thank you, @andylolz! I'm working with a bit of inherited conversion code, and the above was literally the result of a misplaced indentation in code that was meant to add the "withdrawn" status where a code had been omitted:

From:

for key, element in iati_codes.items():
    if key not in dac_codes.keys():
        if element.attrib['status'] != 'withdrawn':
            element.attrib['status'] = 'withdrawn'
            dac_codes[key] = element

To:

for key, element in iati_codes.items():
    if key not in dac_codes.keys():
        if element.attrib['status'] != 'withdrawn':
            element.attrib['status'] = 'withdrawn'
        dac_codes[key] = element

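
To make the effect of that one-level indentation change concrete, here is a minimal, self-contained sketch. The helper names (`make_item`, `sample`, `merge_buggy`, `merge_fixed`) and the codes "A", "B" and "C" are made up for illustration and are not taken from the actual conversion code; the sketch also copies `dac_codes` instead of modifying it in place.

```python
import xml.etree.ElementTree as ET


def make_item(code, status):
    """Build a stand-in codelist item (hypothetical structure)."""
    item = ET.Element("codelist-item", {"status": status})
    ET.SubElement(item, "code").text = code
    return item


def sample():
    """Return fresh (iati_codes, dac_codes) dictionaries with made-up codes."""
    iati_codes = {
        "A": make_item("A", "active"),     # still present in the DAC list
        "B": make_item("B", "active"),     # dropped from the DAC list
        "C": make_item("C", "withdrawn"),  # already withdrawn, absent from DAC
    }
    dac_codes = {"A": make_item("A", "active")}
    return iati_codes, dac_codes


def merge_buggy(iati_codes, dac_codes):
    """Original indentation: only newly withdrawn codes are carried over."""
    merged = dict(dac_codes)
    for key, element in iati_codes.items():
        if key not in merged:
            if element.attrib["status"] != "withdrawn":
                element.attrib["status"] = "withdrawn"
                merged[key] = element
    return merged


def merge_fixed(iati_codes, dac_codes):
    """Corrected indentation: every code missing from DAC is carried over."""
    merged = dict(dac_codes)
    for key, element in iati_codes.items():
        if key not in merged:
            if element.attrib["status"] != "withdrawn":
                element.attrib["status"] = "withdrawn"
            merged[key] = element
    return merged


if __name__ == "__main__":
    print(sorted(merge_buggy(*sample())))
    print(sorted(merge_fixed(*sample())))
```

In the buggy version the already-withdrawn code "C" never reaches the output, which matches the removals noted above; in the fixed version it is kept with its existing status.
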
andylolz commented 4 years ago

Fix sounds good to me.

> Working with a bit of inherited conversion code

Interesting – is the conversion code public, at all?

In case it’s of use, the code we are using to do this same task for codelists.codeforiati.org is here: https://github.com/codeforIATI/codelist-updater/blob/master/importers/helpers/__init__.py

akmiller01 commented 4 years ago

Yep, absolutely. I've hosted it here: https://github.com/akmiller01/DAC-Codelists. I don't know whether it would be more appropriate to place it elsewhere.

It does seem like parts of it have already been sourced from your Codelist Updater.

andylolz commented 4 years ago

Great – many thanks for sharing this!

> It does seem like parts of it have already been sourced from your Codelist Updater.

Seems very possible that they have a shared ancestor. The codelist-updater code is based on the code in this pull request, which in turn is based on the code in this pull request.

> I don't know if it would be more appropriate to place it elsewhere.

I’d suggest moving to an IATI owned repo.

stevieflow commented 4 years ago

Don't forget all the previous work around this:

https://github.com/IATI/IATI-Codelists-NonEmbedded/pull/51
https://github.com/IATI/IATI-Codelists-NonEmbedded/pull/172

andylolz commented 4 years ago

It appears as though missing withdrawal dates have been set to 28th Feb 2020 throughout. I think that could potentially cause confusion. Many of these codes were withdrawn long before that date. Where the withdrawal date is not known, I think it’s probably safer to leave it blank.
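
As a rough sketch of that suggestion: set a withdrawal date only when one is actually known, and otherwise leave the attribute off. The attribute name `withdrawal-date`, the `mark_withdrawn` helper, and the sample codes and dates are assumptions for illustration, not taken from the conversion code.

```python
import xml.etree.ElementTree as ET


def mark_withdrawn(item, code, known_dates):
    """Mark an item as withdrawn, adding a date only if one is known.

    known_dates is a hypothetical mapping of code -> ISO date string.
    """
    item.set("status", "withdrawn")
    date = known_dates.get(code)
    if date:
        # Assumed attribute name for the withdrawal date.
        item.set("withdrawal-date", date)
    # Unknown date: do not invent one; the attribute is left untouched.
    return item


if __name__ == "__main__":
    known = {"X1": "2015-12-31"}  # made-up code and date
    for code in ("X1", "X2"):
        element = mark_withdrawn(ET.Element("codelist-item"), code, known)
        print(ET.tostring(element, encoding="unicode"))
```
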

PetyaKangalova commented 4 years ago

@andylolz thanks for all your comments. On the dates, we did discuss this with Alex and are now waiting to hear back from the OECD on whether they can specify withdrawal dates where they are missing. I will provide an update here once I get a response.

akmiller01 commented 4 years ago

Attaching documents from the OECD: one spreadsheet with activation/withdrawal years for specific sector codes, and two PDFs with the withdrawal decisions for the 151 and 920 series:

oldPurposecodes.xlsx

https://one.oecd.org/document/DCD/DAC/STAT(2010)6/en/pdf
https://one.oecd.org/document/DCD/DAC/STAT(2012)3/en/pdf

andylolz commented 4 years ago

This is great! Great work getting the activation and withdrawal dates from the DAC :tada:

It looks like those dates were added to the source XML by the DAC – is that correct?

With these latest improvements, it looks like the XML may have overtaken the XLS as the canonical source for DAC codelist data!

Worth noting that some codelists are still only present in the DAC XLS, and not the XML. I think the only one that IATI replicates is Region. It would be great to get this added to the XML.

akmiller01 commented 4 years ago

Hi Andy, the OECD did not have time to add the new dates to their source XML, so I manually went into the output and added them according to the XLSX and PDF documents attached in a comment above. We'll reach out to them about the French translation you noted.

You can preview some of the new codes in the SSOT here: http://dev.reference.iatistandard.org/203/codelists/Sector/

I believe all looks good on our end, so shortly we'll be deploying that to http://reference.iatistandard.org as well.