vincentarelbundock / countrycode

R package: Convert country names and country codes. Assigns region descriptors.
https://vincentarelbundock.github.io/countrycode
GNU General Public License v3.0

add dhs script #218

Closed mcooper closed 4 years ago

mcooper commented 4 years ago

Hello,

Thanks for this fantastic package, I use it all the time.

One data source I frequently work with is the Demographic and Health Surveys (DHS), which use their own country code format. It would be great to add their scheme to the package.

Thanks!!

cjyetman commented 4 years ago

This is cool, thanks! The build failed, but it looks like it was for a completely unrelated reason:

Error: package ‘R6’ was installed before R 4.0.0: please re-install it

cjyetman commented 4 years ago

In case anyone wonders, there are some significant differences between the DHS codes and iso2c...

library(jsonlite)
library(dplyr)
library(countrycode)
url <- 'https://api.dhsprogram.com/rest/dhs/countries?select=DHS_CountryCode,CountryName&f=json'
fromJSON(url)[['Data']] %>%
  dplyr::select(CountryName, DHS_CountryCode) %>% 
  mutate(iso2c = countrycode(CountryName, 'country.name', 'iso2c')) %>% 
  filter(DHS_CountryCode != iso2c)

#>             CountryName DHS_CountryCode iso2c
#> 1              Botswana              BT    BW
#> 2               Burundi              BU    BI
#> 3    Dominican Republic              DR    DO
#> 4           El Salvador              ES    SV
#> 5     Equatorial Guinea              EK    GQ
#> 6             Guatemala              GU    GT
#> 7                 India              IA    IN
#> 8            Kazakhstan              KK    KZ
#> 9       Kyrgyz Republic              KY    KG
#> 10              Liberia              LB    LR
#> 11           Madagascar              MD    MG
#> 12              Moldova              MB    MD
#> 13              Namibia              NM    NA
#> 14            Nicaragua              NC    NI
#> 15                Niger              NI    NE
#> 16 Nigeria (Ondo State)              OS    NG

mcooper commented 4 years ago

Just FYI: after seeing @cjyetman print the first 16 lines, I realized there is a duplicate in there. One survey was conducted only in Ondo State, Nigeria, so that survey had its own code. I filtered that one out and checked that all the remaining entries are at the country level and that there are no duplicates.
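
For readers who want to see what that check looks like, here is a minimal sketch in base R. It is not the script that was merged: the data frame is a small hand-typed subset built from the rows discussed above, and the names `dhs` and `dhs_countries` are hypothetical.

```r
# Illustrative subset only; the real list comes from the DHS API.
dhs <- data.frame(
  CountryName     = c("Niger", "Nigeria", "Nigeria (Ondo State)", "Namibia"),
  DHS_CountryCode = c("NI", "NG", "OS", "NM"),
  iso2c           = c("NE", "NG", "NG", "NA"),  # "NA" is the string code for Namibia
  stringsAsFactors = FALSE
)

# The sub-national survey maps to the same iso2c code as Nigeria itself.
dhs[duplicated(dhs$iso2c) | duplicated(dhs$iso2c, fromLast = TRUE), ]

# Drop the Ondo State entry so every row is a country.
dhs_countries <- dhs[dhs$CountryName != "Nigeria (Ondo State)", ]

# Verify that no DHS code or iso2c code is repeated.
stopifnot(anyDuplicated(dhs_countries$DHS_CountryCode) == 0)
stopifnot(anyDuplicated(dhs_countries$iso2c) == 0)
```

Filtering by the exact entry name is the simplest option for a single known exception; a longer list of sub-national surveys would probably call for a more general rule.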

vincentarelbundock commented 4 years ago

Thanks a lot @mcooper !

I merged the script and rebuilt the dictionary. If you could please try the GitHub master version, I can then upload it to CRAN:

remotes::install_github('vincentarelbundock/countrycode')
library(countrycode)
countrycode(c('Afghanistan', 'Namibia', 'Liberia'), 'country.name', 'dhs')

vincentarelbundock commented 4 years ago

Thanks @cjyetman for comparing the two sets. Very useful!

mcooper commented 4 years ago

I just tried it locally on a project I'm in the middle of, and it works great. Thanks for updating the package so quickly!

Mat

vincentarelbundock commented 4 years ago

This is much easier when people submit actual working code ;)

vincentarelbundock commented 4 years ago

[Image: Screen Shot 2020-05-07 at 15 19 46]