vincentarelbundock / countrycode

R package: Convert country names and country codes. Assigns region descriptors.
https://vincentarelbundock.github.io/countrycode
GNU General Public License v3.0

How to contribute a new table with country names? #284

Closed iago-pssjd closed 3 years ago

iago-pssjd commented 3 years ago

I am asking because the Contributions section of the main README lists two options: "Adding a new code" and "Custom dictionaries". I would expect a new table with another variant of country names to be contributed through the first option, but I see that the Global Burden of Disease country names are included as a custom dictionary. What should I do, then? Why are the GBD names a custom dictionary and not a code?

Thanks!

cjyetman commented 3 years ago

I believe it's because the only unique thing about the Global Burden of Disease data is its region names. Region names are a dime a dozen, and they can only be converted in one direction (country -> region, but not region -> country), so we've resisted adding innumerable variations of region name datasets.
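
To illustrate the one-direction point, here is a minimal base-R sketch. The `lookup` table and its region labels are made-up placeholders, not data from the package; the only claim is that a many-to-one mapping has no unique inverse.

```r
# Hypothetical mini lookup table (illustrative values only, not package data)
lookup <- data.frame(
  iso3c  = c("DEU", "ESP", "USA"),
  region = c("Europe", "Europe", "North America"),
  stringsAsFactors = FALSE
)

# country -> region: every code maps to exactly one region
country_to_region <- setNames(lookup$region, lookup$iso3c)
country_to_region[["ESP"]]

# region -> country: one region maps back to several codes,
# so there is no single country to return
region_to_countries <- split(lookup$iso3c, lookup$region)
region_to_countries[["Europe"]]
```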

vincentarelbundock commented 3 years ago

That is also my recollection. We are 100% open to including new and distinct country CODES, but would need very good arguments to add new NAMES or REGIONS.

iago-pssjd commented 3 years ago

Thanks to both of you for the answers. I thought one of the interesting things about this package is that it lets you quickly transform the codes or names in one table/data.frame into those used in another, which makes merges easier and faster.

Indeed, even for what I would consider a single code, English country names, there are many variants: the one codified in country.name.en, the restricted GBD country names, and many others. Some use "Republic of Whatever", others "Whatever, Republic of", others "Whatever (Republic of)"; some use "and" while others use "&", and so on. Therefore I thought, probably naively, the more the merrier.

In conclusion, I will just add the new names I am interested in to my fork (I do not think they would be of interest to you). Separately, I will soon open a new issue, because I get an error when running dictionary/build.R.

Thanks again!

vincentarelbundock commented 3 years ago

Thanks for understanding.

Personally, I think it's a bad idea to maintain your own fork, because you might not benefit from future improvements.

A better alternative might be to use the countrycode_factory function available on GitHub now and in the next CRAN release. See the README for examples.

But to each their own, of course.

iago-pssjd commented 3 years ago

I agree, but I am not sure how I should use countrycode_factory. Should I build a new function (my own, without modifying the package) after downloading the database, and treat it as a personal/local custom dictionary?

Thanks!

vincentarelbundock commented 3 years ago

I see that this could be confusing. Frankly, you might just want to define your own function that calls:


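```r
# `yourdata` is a data frame with (at least) the columns "iso3c" and "newname"
f <- function(x) {
  countrycode(x, origin = "iso3c", destination = "newname", custom_dict = yourdata)
}
```
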
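
For a concrete picture, here is a minimal self-contained sketch of that wrapper idea. The table `my_names`, its `newname` column, and the function name `to_newname` are hypothetical placeholders chosen for illustration; only `countrycode()` and its `custom_dict` argument come from the package.

```r
library(countrycode)

# Hypothetical custom dictionary: one column to match on, one with the names you want
my_names <- data.frame(
  iso3c   = c("USA", "DEU", "ESP"),
  newname = c("The United States of America", "Deutschland", "España"),
  stringsAsFactors = FALSE
)

# Wrapper with fixed origin/destination, so callers only pass the codes
to_newname <- function(x) {
  countrycode(x, origin = "iso3c", destination = "newname", custom_dict = my_names)
}

to_newname(c("DEU", "USA"))
```

Because the table lives in your own script rather than in a fork, updating the package does not touch it.
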
cjyetman commented 3 years ago

In cases where I want to convert to a specific, custom set of country names, I usually do something like this...

```r
library(countrycode)
data <- data.frame(iso = c("US","DE","ES"), value = c(1,2,3))

data$name <- countrycode(data$iso, "iso2c", "country.name")

custom_country_names <- 
  data.frame(
    iso2c = c("US","DE","ES"), 
    name = c("The United States of America", "Deutschland", "España")
    )

data$custom_name <- 
  countrycode(data$iso, origin = "iso2c", destination = "name", 
              custom_dict = custom_country_names)

data
#>   iso value          name                  custom_name
#> 1  US     1 United States The United States of America
#> 2  DE     2       Germany                  Deutschland
#> 3  ES     3         Spain                       España
```

iago-pssjd commented 3 years ago

Thanks both!