CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

ISO 3166-2 country codes #372

Open ehog90 opened 4 years ago

ehog90 commented 4 years ago

I think adding ISO 3166-2 country codes to the CSVs (or optionally the region codes) would be useful.

AymericBou commented 4 years ago

I did it for my project and can share it. Excel format: here. CSV format (pipe-delimited): here.

CSSEGISandData commented 4 years ago

Thank you for the suggestion. We have received similar requests and will keep them in mind when building out future capabilities.

Bost commented 4 years ago

See https://github.com/Bost/corona_cases/issues/1

(And to put all issues relevant to this subject in one bag: https://github.com/CSSEGISandData/COVID-19/issues/105)

Eclipsed830 commented 4 years ago

The problem with using ISO country codes is that they are politicized, because being involved with the ISO requires UN membership. This is why using ISO 3166-2 country codes and naming is considered bad practice, and most software developers use Unicode CLDR instead. http://cldr.unicode.org/translation/displaynames/country-names

https://github.com/unicode-cldr/cldr-localenames-full/blob/master/main/en/territories.json
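For illustration, a lookup against the linked territories.json could be sketched as below. The inline excerpt is hand-written and only assumed to mirror the file's nesting (main, then the locale, then localeDisplayNames, then territories); `territory_name` is a hypothetical helper, not part of CLDR.

```python
import json

# Hand-written excerpt, assumed to mirror the layout of territories.json
# in cldr-localenames-full. The real file contains many more entries.
CLDR_EXCERPT = json.loads("""
{
  "main": {
    "en": {
      "localeDisplayNames": {
        "territories": {
          "TW": "Taiwan",
          "US": "United States"
        }
      }
    }
  }
}
""")


def territory_name(cldr, code, locale="en"):
    """Return the display name for a territory code, or None if unknown."""
    territories = cldr["main"][locale]["localeDisplayNames"]["territories"]
    return territories.get(code.upper())


print(territory_name(CLDR_EXCERPT, "tw"))
```

In practice the dictionary would be loaded from the downloaded file with `json.load` rather than from an inline string.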

pmdci commented 4 years ago

It is not true that only UN members can have ISO 3166-1 codes. Taiwan (ROC) is not a UN member and it does have an ISO 3166-1 code.

Using custom codes that aren't from the USER-ASSIGNED pool (like DP for Diamond Princess) is likely to cause issues. When adding codes for non-3166-1 locations (e.g. Diamond Princess), make sure to use USER-ASSIGNED code elements for Alpha-2 and Alpha-3.

See: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#User-assigned_code_elements https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#User-assigned_code_elements

So you could use something like ZZ as Alpha-2, or anything starting with X. There are 43 possible codes that are reserved for custom usage.
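A minimal sketch of checking a custom code against the user-assigned Alpha-2 pool, assuming the ranges listed in the Wikipedia article linked above (AA, QM to QZ, XA to XZ, and ZZ). Enumerated this way the pool comes to 42 Alpha-2 codes. The function names are hypothetical.

```python
import string


def user_assigned_alpha2():
    """List the Alpha-2 codes in the user-assigned ranges AA, QM-QZ, XA-XZ, ZZ."""
    codes = ["AA"]
    codes += ["Q" + c for c in string.ascii_uppercase if "M" <= c <= "Z"]
    codes += ["X" + c for c in string.ascii_uppercase]
    codes.append("ZZ")
    return codes


def is_user_assigned_alpha2(code):
    """True if the code falls in the user-assigned Alpha-2 pool."""
    return code.upper() in set(user_assigned_alpha2())


print(len(user_assigned_alpha2()))
print(is_user_assigned_alpha2("DP"))  # DP is outside the pool
print(is_user_assigned_alpha2("XD"))  # any X-prefixed code is inside it
```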

As for numeric codes, no number range is currently reserved. I tend to store ISO 3166-1 numeric codes as SMALLINT in SQL Server (up to 32,767), so you could assign values over 30,000 or perhaps negative values. I strongly suggest AGAINST using negative values though.
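The numeric suggestion could be sketched like this: custom locations get sequential values starting at 30,000, staying positive and within the SMALLINT maximum. The class name and the starting value are illustrative assumptions, not an established convention.

```python
SMALLINT_MAX = 32767  # upper bound of a SQL Server SMALLINT


class CustomNumericAllocator:
    """Hands out positive custom numeric codes from a high, unused range."""

    def __init__(self, start=30000):
        if not 0 < start <= SMALLINT_MAX:
            raise ValueError("start must be positive and fit in SMALLINT")
        self._next = start
        self._assigned = {}

    def assign(self, name):
        """Return the code for name, allocating a new one on first use."""
        if name in self._assigned:
            return self._assigned[name]
        if self._next > SMALLINT_MAX:
            raise OverflowError("custom numeric range exhausted")
        code = self._next
        self._assigned[name] = code
        self._next += 1
        return code


allocator = CustomNumericAllocator()
print(allocator.assign("Diamond Princess"))
```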

Eclipsed830 commented 4 years ago

It is not true that only UN members can have ISO 3166-1 codes. Taiwan (ROC) is not a UN member and it does have an ISO 3166-1 code

Taiwan's full name using 3166 is "Taiwan, Province of China" tho...

Bost commented 4 years ago

So you could use something like ZZ as Alpha-2, or anything starting with X. There are 43 possible codes that are reserved for custom usage.

Please use codes other than ZZ and XX:

Thanks

pmdci commented 4 years ago

That's good to know, but I wonder which Wikipedia article is correct. According to the ISO3166-1 article:

User-assigned code elements are codes at the disposal of users who need to add further names of countries, territories, or other geographical entities to their in-house application of ISO 3166-1, and the ISO 3166/MA will never use these codes in the updating process of the standard. The following codes can be user-assigned:

Also according to the article you pointed out, XX is in the OC (Oceania) continent, which doesn't make much sense (shouldn't the continent be null?). Seems like it is pointing to a specific disputed territory in the OC continent.

Also what about those continent codes? Are they from an ISO standard? Which one? Who maintains this list? How widely adopted is this list as a standard? Not trying to discredit your suggestion, but if this is a widely adopted standard I'd like to know (I might be interested in using it myself!)

Bost commented 4 years ago

@pmdci we've been working on this for quite a long time: https://github.com/ExpDev07/coronavirus-tracker-api/blob/master/app/utils/countrycodes.py Have a look or, even better, create a PR please.

pmdci commented 4 years ago

I said my piece and you guys seem to have your own way of doing things. Rather than a PR that is likely to just throw fuel on the fire, I just created a mapping table at our end to handle non-3166-1 gibberish.
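A mapping table of that kind might look roughly like the sketch below. The entries are hypothetical: DP is the custom code mentioned earlier in this thread, and XD is just one arbitrary pick from the user-assigned X range.

```python
# Hypothetical mapping from upstream custom codes to user-assigned codes.
CUSTOM_TO_USER_ASSIGNED = {
    "DP": "XD",  # Diamond Princess (illustrative choice of target code)
}


def normalize_code(code, mapping=CUSTOM_TO_USER_ASSIGNED):
    """Translate a known custom code; pass every other code through unchanged."""
    code = code.upper()
    return mapping.get(code, code)


print(normalize_code("dp"))
print(normalize_code("TW"))
```

Keeping the table on the consuming side means upstream codes can change without touching the rest of the pipeline; only the dictionary needs updating.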