nish-kishore / sirfunctions

Key functions used by the SIR team

extract_country_data() should also include country and prov population data #69

mcuadera closed this issue 5 months ago

mcuadera commented 5 months ago

extract_country_data() already contains prov population data. It would be handy to attach the country and district population data as well. This would keep data access consistent across desk reviews: once raw.data has been subset, analysts can do their analysis entirely from the dataset returned by extract_country_data().

In the desk reviews, we currently subset the data like so:

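# Packages these snippets rely on
library(dplyr)      # select(), rename(), filter(), between()
library(stringr)    # str_trim(), str_to_upper()
library(lubridate)  # year()

# Country-level population for the selected country and review period
cpop.ctry <- raw.data$ctry.pop |>
  select(ADM0_NAME, adm0guid, u15pop, year, datasource) |>
  rename("ctry" = "ADM0_NAME") |>
  filter(ctry == str_trim(str_to_upper(country)),
         between(year, year(date_first), year(date_last))
  )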

# Province-level population, with columns renamed to match cpop.ctry
ppop.ctry <- raw.data$prov.pop |>
  select(ADM0_NAME, ADM0_GUID, ADM1_NAME, adm1guid, u15pop.prov, year, 
         datasource) |>
  rename("adm0guid" = "ADM0_GUID",
         "ctry" = "ADM0_NAME",
         "prov" = "ADM1_NAME",
         "u15pop" = "u15pop.prov") |>
  filter(ctry == str_trim(str_to_upper(country)),
         between(year, year(date_first), year(date_last))
  )

# District-level population, same country and year filter
dpop.ctry <- raw.data$dist.pop |>
  select(ADM0_NAME, ADM0_GUID, ADM1_NAME, adm1guid, ADM2_NAME, adm2guid, 
         u15pop, year, datasource) |>
  rename("adm0guid" = "ADM0_GUID",
         "ctry" = "ADM0_NAME",
         "prov" = "ADM1_NAME",
         "dist" = "ADM2_NAME") |>
  filter(ctry == str_trim(str_to_upper(country)),
         between(year, year(date_first), year(date_last))
  )