OCHA-DAP / Data-Team

A place for tracking data team issues

Determine method for using country names and ISO3 codes #39

Closed luiscape closed 10 years ago

luiscape commented 10 years ago

We currently have a few inconsistencies in the way we assign country names and country codes. I compared our terminology with ReliefWeb's and highlighted the differences here:

https://docs.google.com/spreadsheets/d/1ZzjQY1LLF0SpyG6dU7keAKtDbpRxf4gOeiL-c3Jb0wE/edit#gid=0

To fully solve this issue, I suggest the following method:

  1. Use the official names provided by UNTERM: (A) we have already scraped them, and (B) they come in the 6 official UN languages.
  2. Use discretionary "Short Names" according to our needs. We are responsible for determining those names. For example, our system could benefit from using "Venezuela" instead of "the Bolivarian Republic of Venezuela".

Once we define that method I'll create an official "dictionary" of names for the project.
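
A minimal sketch of what such a dictionary could look like, assuming entries are keyed by ISO3 code and that a short name is optional. The `CountryName` class, the `NAMES` table, and the `display_name` helper are hypothetical names used only for illustration; the Kenya entry is a sample value added to show the fallback case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CountryName:
    """One dictionary entry: an official name plus an optional short name."""
    official: str
    short: Optional[str] = None


# Hypothetical sample entries keyed by ISO3 code (illustrative, not the real list).
NAMES = {
    "VEN": CountryName(
        official="the Bolivarian Republic of Venezuela",
        short="Venezuela",
    ),
    # No short name defined: lookups fall back to the official name.
    "KEN": CountryName(official="the Republic of Kenya"),
}


def display_name(iso3: str, prefer_short: bool = True) -> str:
    """Return the short name when one exists and is preferred, else the official name."""
    entry = NAMES[iso3.upper()]
    if prefer_short and entry.short:
        return entry.short
    return entry.official
```

With this layout, `display_name("VEN")` yields the short name, while `display_name("VEN", prefer_short=False)` yields the official one. Keeping both forms in a single entry means the two can never drift apart across datasets.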

luiscape commented 10 years ago

@takavarasha Also let me know what you think.

JavierTeran commented 10 years ago

Great analysis. Please see my comments on Issue #38. Godfrey has already standardized all focus countries in the HDX repository. We only did long names, but your proposal for short names is great. What do you think, @takavarasha?

luiscape commented 10 years ago

The method for official names using http://unstats.un.org/unsd/methods/m49/m49alpha.htm sounds appropriate. That would be the right thing to do in my opinion.

As for "Short Names", we could implement them and use whatever we want -- or whatever makes sense.

Finally, if translation is necessary, I suggest using UNTERM for the other 5 languages that aren't in the http://unstats.un.org/unsd/methods/m49/m49alpha.htm list.

How does that sound?
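
A rough sketch of that lookup order, assuming the M49 list supplies the English name and UNTERM supplies the other five languages. The table names, the `official_name` helper, and the sample values are hypothetical placeholders, not data copied from either source.

```python
# Two-letter codes for the six official UN languages.
UN_LANGUAGES = {"ar", "zh", "en", "fr", "ru", "es"}

# Hypothetical tables; sample values are illustrative placeholders only.
M49_ENGLISH = {
    "VEN": "Venezuela (Bolivarian Republic of)",
}
UNTERM_OTHER = {
    ("VEN", "es"): "República Bolivariana de Venezuela",
}


def official_name(iso3: str, lang: str = "en") -> str:
    """Look up the official name: M49 for English, UNTERM for other languages."""
    if lang not in UN_LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    if lang == "en":
        return M49_ENGLISH[iso3]
    return UNTERM_OTHER[(iso3, lang)]
```

A missing entry raises `KeyError` rather than silently falling back to English, which makes gaps in the translated names easy to spot.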

JavierTeran commented 10 years ago

Sounds good to me. @takavarasha, do you agree?

luiscape commented 10 years ago

@JavierTeran @takavarasha -- Sorry for being pushy on this issue, but I see this as an essential standard to have -- and an easy win. What is the current status, and which tasks still need to be completed?

JavierTeran commented 10 years ago

The agreed spelling of country names (short and long form) will be defined here: https://docs.google.com/spreadsheet/ccc?key=0AoSjej3U9V6fdHJzcWNreF8tVDNXTlpaeXl3Z3h3WWc&usp=drive_web#gid=15 (column in orange). It is still a work in progress; I will update it in the coming days.

luiscape commented 10 years ago

@JavierTeran I just checked the Google spreadsheet and it does seem to need some work. Let me know if you need help.

JavierTeran commented 10 years ago

@luiscape Yes, please jump in there when you have a minute. Thanks.

JavierTeran commented 10 years ago

We decided to use UNSD country codes as the default in HDX.