Priesemann-Group / covid19_inference

Bayesian python toolbox for inference and forecast of the spread of the Coronavirus
GNU General Public License v3.0

Normalize country names to ISO 3166 #14

Closed joaopn closed 4 years ago

joaopn commented 4 years ago

To automate country analysis later on, we'll need to deal with the different ways countries are named in different sources. South Korea, for instance, is called "South Korea" (Google), "Korea, South" (JHU) and "Republic of Korea" (Apple). What about enforcing a single convention, e.g. ISO 3166 naming, during download?

jdehning commented 4 years ago

Yep, but normalizing all the countries takes some time. Do you know a fast way to do it?

joaopn commented 4 years ago

Not really. The best I can think of is checking whether each name is in the ISO 3166 list, implementing a manual translation dict for those that are not, and giving a warning if a name is in neither.
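
A minimal sketch of that approach, for illustration only. The function name `normalize_country_name`, the constants `ISO_3166_NAMES` and `MANUAL_TRANSLATIONS`, and the choice to return unrecognized names unchanged are all assumptions, not code from the toolbox. The ISO set here is a tiny hand-picked subset; a real version would load the full ISO 3166-1 list.

```python
import warnings

# Illustrative subset of ISO 3166-1 English short names (assumption:
# a real implementation would load the complete list instead).
ISO_3166_NAMES = {"Germany", "Brazil", "Korea, Republic of"}

# Hypothetical manual translation dict for source-specific spellings
# that do not appear in the ISO list.
MANUAL_TRANSLATIONS = {
    "South Korea": "Korea, Republic of",        # Google
    "Korea, South": "Korea, Republic of",       # JHU
    "Republic of Korea": "Korea, Republic of",  # Apple
}


def normalize_country_name(name):
    """Return the ISO 3166 name for `name`, warning if it is unknown."""
    name = name.strip()
    if name in ISO_3166_NAMES:
        # Already an ISO name: nothing to do.
        return name
    if name in MANUAL_TRANSLATIONS:
        # Known alternative spelling: translate it.
        return MANUAL_TRANSLATIONS[name]
    # In neither the ISO list nor the dict: warn and leave it unchanged.
    warnings.warn(f"Country name {name!r} not recognized; leaving it unchanged.")
    return name
```

Calling `normalize_country_name` on each country label right after download would map all three South Korea spellings to the same name, and the warnings would show which entries still need to be added to the dict.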

jdehning commented 4 years ago

So, the way to go, if someone wants to implement it, is: