IATI / D-Portal

http://d-portal.org/

Various country codes have incorrect country names associated #481

Closed andylolz closed 6 years ago

andylolz commented 6 years ago

See for example Georgia: http://www.d-portal.org/ctrack.html?search&country=GE#view=main

…and Anguilla: http://www.d-portal.org/ctrack.html?search&country=AI#view=main

Error introduced in 80703689, and spotted by @elisedufief.

The data in question is derived from this table on Wikipedia: https://en.wikipedia.org/w/index.php?title=ISO_3166-1_alpha-2#Decoding_table

…and is scraped by this d-portal code. The bit that splits on “:” is doing the wrong thing as a result of (I think) these Wikipedia page updates. There’s also a bunch of “; unassigned” suffixes, which relate to a change in how Wikipedia formats these titles.


In case it’s useful, I have an ISO3166-1 alpha-2 scraper running here: https://morph.io/andylolz/country-codes

…that scrapes from source. At some point, I might try and integrate it into https://github.com/datasets/country-codes

Alternatively, you could use the IATI Country codelist (and if it’s out of sync, send a PR or hassle them to fix it).
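
Whichever source ends up being used, a cheap sanity check against a trusted list would probably catch this kind of regression early. Here is a minimal sketch, assuming both lists have already been loaded into plain code-to-name dicts. The function name `diff_codelists` and all the sample data are hypothetical and not taken from d-portal or the IATI codelist:

```python
from typing import Dict, List


def diff_codelists(scraped: Dict[str, str], reference: Dict[str, str]) -> Dict[str, List]:
    """Compare a scraped code -> name mapping against a trusted reference mapping."""
    shared = scraped.keys() & reference.keys()
    # Codes present in both lists whose names differ (ignoring case and outer whitespace).
    mismatched = sorted(
        (code, scraped[code], reference[code])
        for code in shared
        if scraped[code].strip().casefold() != reference[code].strip().casefold()
    )
    return {
        "mismatched": mismatched,
        # Codes the reference has but the scrape lost.
        "missing": sorted(reference.keys() - scraped.keys()),
        # Codes the scrape produced that the reference does not know.
        "extra": sorted(scraped.keys() - reference.keys()),
    }


# Hypothetical sample data: the "scraped" values imitate the kind of breakage
# described in this issue (a wrong name, a stray "; unassigned" suffix).
reference = {"GE": "Georgia", "AI": "Anguilla", "FR": "France", "DE": "Germany"}
scraped = {
    "GE": "Gabon",
    "AI": "Anguilla; unassigned",
    "FR": "France",
    "ZZ": "Unknown",
}

report = diff_codelists(scraped, reference)
for code, got, expected in report["mismatched"]:
    print(f"{code}: scraped {got!r}, expected {expected!r}")
```

Running something like this after each scrape, and refusing to import when `mismatched` is non-empty, would flag cases like GE and AI before they reach the site.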

notshi commented 6 years ago

Many thanks, @andylolz and @elisedufief. We'll merge and update once the import is done (it's running a bit slowly at the moment, so we're waiting for it to finish before pushing a change to hopefully fix that).

xriss commented 6 years ago

Should all be fixed now.

andylolz commented 6 years ago

Sorry, I just realised/remembered I didn’t say thanks for sorting this so quickly! Thank you