OSeMOSYS / osemosys2iamc

MIT License

UK not recognised as region by iso3166 package #33

Open willu47 opened 1 year ago

willu47 commented 1 year ago

The EU uses UK instead of GB as the 2-character code for the United Kingdom. This mapping is not included in the iso3166 Python package.

danielhuppmann commented 1 year ago

This is also true for Greece. We’ve made a hacky fix in the openentrance country definitions, see here, and in the utility method, see here.
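For illustration, a minimal sketch of such an override, assuming the only problem cases are the EU-style codes UK and EL (the EU's code for Greece). The names `EU_TO_ISO2` and `normalise_iso2` are hypothetical and do not come from openentrance or iso3166; the idea is simply to normalise the code before handing it to whichever lookup package is in use.

```python
# Hypothetical alias table: EU-style codes mapped to ISO 3166-1 alpha-2 codes.
# Only the two cases mentioned in this thread are covered.
EU_TO_ISO2 = {"UK": "GB", "EL": "GR"}


def normalise_iso2(code: str) -> str:
    """Return the ISO 3166-1 alpha-2 code for an EU-style or ISO code."""
    cleaned = code.strip().upper()
    # Fall back to the cleaned input when no alias is defined.
    return EU_TO_ISO2.get(cleaned, cleaned)
```

With this in place, `normalise_iso2("UK")` yields `"GB"`, which can then be passed to the regular country lookup.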

willu47 commented 1 year ago

Thanks @danielhuppmann - we've removed the openentrance package as a dependency to avoid #24 and now use the smaller iso3166 package instead, but this has flagged some inconsistencies between the two. For example, the country name in iso3166 for the UK is "United Kingdom of Great Britain and Northern Ireland" and not "United Kingdom" as it is in openentrance...

danielhuppmann commented 1 year ago

Right, that was rather meant as a reference point for possible solutions. The openentrance utility will not be maintained, but will instead be rebuilt in a more stable manner in other packages.

I'm having a similar problem with international country-names, where we use the pycountry package as a dependency of nomenclature. This PR https://github.com/IAMconsortium/nomenclature/pull/262 adds a list of hard-coded overrides for readability and consistency with community standards.

willu47 commented 1 year ago

Okay, so a more direct question for you on this topic. Could you tell me what the recommended package or utility is to ensure compatibility with the IAMC template (and the nomenclatures used within ECEMF and IAM-COMPACT), where the only function we need is to convert from ISO2 or ISO3 country codes?

Options:

  1. Force models to use iso2 or iso3 in the iamc template (pycountry or py3166 should do the job) [my preferred option]
  2. Enforce standard of official country names (need a mapping of country names accepted by ecemf/iam-compact to iso2/3)

danielhuppmann commented 1 year ago

For clarification: the IAMC data format just requires a "region" column.

The "nomenclature", i.e., the actual definitions of allowed values, is defined separately for each project, and these definitions are currently not always consistent across projects.

In openENTRANCE and ECEMF, the list of allowed country names is explicitly defined here - this list currently only includes European countries. I'm not sure where such data would exist for iam-compact?

I don't like the idea of forcing results to be reported directly in ISO3 codes, because this is very unwieldy for users.

The current effort in the nomenclature package aims to have a standardized reference for all country names that can be used by any future projects. We could add conversion tools between name and ISO2/ISO3 based on the pycountry package (which seems to be more stable than py3166), with some agreed simplifications and addition(s), in particular Kosovo.

There are currently four naming conflicts between pycountry and openENTRANCE:

'Czech Republic', 'Moldova', 'Russia', 'The Netherlands'

FYI: Kosovo is not a universally recognized country and does not have an ISO3 code.
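The override approach described above could be sketched roughly as follows. This is an assumption-laden illustration, not the nomenclature implementation: `NAME_OVERRIDES`, `country_name`, and `SAMPLE_BASE` are hypothetical names, and `SAMPLE_BASE` is a hand-written stand-in for whatever names a package such as pycountry would return. The override values are the four openENTRANCE names listed above.

```python
from typing import Mapping

# Project-preferred names from this thread, keyed by ISO 3166-1 alpha-2 code.
NAME_OVERRIDES = {
    "CZ": "Czech Republic",
    "MD": "Moldova",
    "RU": "Russia",
    "NL": "The Netherlands",
}


def country_name(iso2: str, base_names: Mapping[str, str]) -> str:
    """Return the preferred name for a code, falling back to base_names."""
    code = iso2.strip().upper()
    if code in NAME_OVERRIDES:
        return NAME_OVERRIDES[code]
    # Raises KeyError for codes missing from both tables.
    return base_names[code]


# Illustrative stand-in for names supplied by an external package.
SAMPLE_BASE = {"NL": "Netherlands", "DE": "Germany"}
```

Here `country_name("NL", SAMPLE_BASE)` returns the override "The Netherlands", while `country_name("DE", SAMPLE_BASE)` falls through to the base table. Keeping the overrides in one small dictionary makes the agreed deviations from the underlying package easy to review.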