openENTRANCE / Model-linkage-Legacy

Region naming convention #21

Open danielhuppmann opened 4 years ago

danielhuppmann commented 4 years ago

The current information in the README.md about the region convention does not follow what was previously discussed by email in the WP4/WP5 steering group.

1) I think it is more intuitive to use full country names (Germany) rather than ISO codes (DE).

2) There will obviously be multiple (multi-country) regional aggregations, which will overlap and possibly conflict. I think it is not a smart strategy to hard-code (as in the example in the README) whether Germany is in "CCE" or "Western Europe". It will be sufficient to name regions as "Germany" (without any hierarchy), "Bayern" (afaik there is no other region/country/province/city in Europe called Bayern), or "Munich" (as above). All we need to make this work is a list of all used terms in a machine-readable format, with a clear structure indicating whether each term is a region (including which countries it contains), a country, a province, or a city.

3) In addition to this list, we can provide a list of mappings from alternative spellings (Slovakia) to the common spelling (Slovak Republic).
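
A minimal sketch of how points 2 and 3 could fit together, assuming a flat dictionary of terms plus a separate synonym mapping. The names `TERMS`, `SYNONYMS`, `normalize` and `regions_containing` are hypothetical, and the region memberships are placeholders, not the project's actual definitions.

```python
# Illustrative data only: memberships are placeholders, not the real dictionary.
TERMS = {
    "Germany": {"type": "country"},
    "Bayern": {"type": "province"},
    "Munich": {"type": "city"},
    "CCE": {"type": "region", "countries": ["Austria", "Germany", "Poland"]},
    "Western Europe": {"type": "region", "countries": ["France", "Germany", "Spain"]},
}

# Alternative spelling -> common spelling.
SYNONYMS = {"Slovakia": "Slovak Republic"}


def normalize(name, synonyms=SYNONYMS):
    """Return the common spelling of a name, or the name itself if no synonym is known."""
    return synonyms.get(name, name)


def regions_containing(country, terms=TERMS):
    """Return every multi-country region that lists the given country."""
    return sorted(
        name
        for name, info in terms.items()
        if info["type"] == "region" and country in info.get("countries", [])
    )
```

Because no hierarchy is encoded in the name itself, Germany can belong to both "CCE" and "Western Europe" without either membership being hard-coded into the term "Germany".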

erikfilias commented 4 years ago

I would like to thank you for the valuable suggestions.

  1. I agree. I will change it.

  2. I think including the continent and country in the name makes no sense, but both of them have to be referenced to the NUTS levels in the dictionary. On the other hand, I was thinking of using the NUTS level codes in the name to get a machine-readable format. Do you agree with using these codes?

  3. I agree. I will provide a list of possible synonyms.

danielhuppmann commented 4 years ago

Not sure what you mean by 2. All you need to do is refactor RegionsEuropa_Dictionary.csv into a machine-readable format (technically, a list like Austria, Czech Republic, Germany, Hungary, Liechtenstein, Poland, Slovakia, Switzerland can be parsed, but doing it this way is not really best practice), and then allow more lines to be added, e.g., Western Europe includes Spain, Portugal, France, Germany, .....
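
To illustrate the parsing point, here is a rough sketch that turns a comma-separated country list inside a single CSV cell into a proper list. The two-column layout (`region`, `countries`) and the function name `parse_regions` are assumptions for the example; the actual layout of RegionsEuropa_Dictionary.csv may differ.

```python
import csv
import io

# Assumed layout (hypothetical): a region name, then a quoted comma-separated country list.
raw = 'region,countries\n"Western Europe","Spain, Portugal, France, Germany"\n'


def parse_regions(text):
    """Map each region name to a list of its countries, one entry per country."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["region"]: [country.strip() for country in row["countries"].split(",")]
        for row in reader
    }


mapping = parse_regions(raw)
```

The free-text cell can be parsed, but it depends on quoting and on no country name containing a comma, which is why one entry per country is the more robust target format.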

About 3 - you don't need to provide a full list of synonyms, you just need to provide a start, with one or two examples to make it clear how the format works. And then it's up to users to pull-request additional synonyms when they need them.

PS: we can worry later about adding some Python code to check for duplicates and add integration tests, so that no pull-request can introduce duplicates by accident.
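
A duplicate check of the kind mentioned in the PS could be as small as the sketch below. The function name `find_duplicates` is hypothetical, and comparing names case-insensitively after stripping whitespace is an assumption about what should count as a duplicate.

```python
from collections import Counter


def find_duplicates(names):
    """Return the normalized names that occur more than once."""
    # Assumption: names differing only in case or surrounding whitespace are duplicates.
    counts = Counter(name.strip().casefold() for name in names)
    return sorted(key for key, count in counts.items() if count > 1)
```

An integration test could then assert that `find_duplicates` returns an empty list for the full set of region names, so a pull request that adds an existing term fails the check.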

erikfilias commented 4 years ago
  1. OK. I changed the CSV file into a list. I'll try to put all the information into a machine-readable YAML format, while the README.md will describe each part of the dictionary.