worldbank / povcalnetR

R client to the Povcalnet API
https://worldbank.github.io/povcalnetR

povcalnet_cl does not return full results when using different poverty lines #18

Closed morkor closed 4 years ago

morkor commented 4 years ago

In more detail, the results are as follows (the remaining countries seem to return poverty rates for all survey years when poverty lines other than $1.90 are requested, not necessarily the same line for each year):

[1] "PCN request for ARG returns an empty tibble (own PLs)."
[1] "PCN request for BOL returns a tibble with less years (own PLs). Missing: 1992"
[1] "PCN request for COL returns a tibble with less years (own PLs). Missing: 1980;1988;1989;1991"
[1] "PCN request for ECU returns a tibble with less years (own PLs). Missing: 1995"
[1] "PCN request for ETH returns a tibble with less years (own PLs). Missing: 1981"
[1] "PCN request for HND returns a tibble with less years (own PLs). Missing: 1986"
[1] "PCN request for URY returns a tibble with less years (own PLs). Missing: 1992;1995;1996;1997;1998;2000;2001;2002;2003;2004;2005"

morkor commented 4 years ago

[using a fully updated R version 3.6.1 (2019-07-05) on Ubuntu 18.04.3 LTS]

morkor commented 4 years ago

I think the missing ones are all urban-based surveys.

tonyfujs commented 4 years ago

@morkor Can you send me a reproducible script so I can figure out what is happening? Thanks!

morkor commented 4 years ago

Here it is:

i <- "ARG"
PLs <- c(3.149,2.066,2.704,1.155,1.552,1.737,1.500,2.132,1.848,1.500,1.430,1.375,2.104,2.152,2.095,2.203,1.768,1.669,1.519,1.841,1.918,1.698,1.570,1.424,1.243,1.092)
TargetYears <- c(1980,1986,1987,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013)
t1 <- povcalnetR::povcalnet_cl(rep(i, length(TargetYears)), PLs, TargetYears)
# this is empty although povcalnet has data on this country:
t2 <- povcalnetR::povcalnet(i)
sort(unique(t2$year)[which(!(unique(t2$year) %in% unique(t1$year)))])

i <- "BOL"
PLs <- c(1.038,1.145,1.160,1.171,1.173,1.138,1.110,1.130,1.156,1.155,1.182,1.206,1.212,1.259,1.263,1.260,1.261,1.284,1.268,1.266)
TargetYears <- c(1990,1992,1997,1999,2000,2001,2002,2004,2005,2006,2007,2008,2009,2011,2012,2013,2014,2015,2016,2017)
t1 <- povcalnetR::povcalnet_cl(rep(i, length(TargetYears)), PLs, TargetYears)
# note here that 1992 is missing from the tibble compared to:
t2 <- povcalnetR::povcalnet(i)
sort(unique(t2$year)[which(!(unique(t2$year) %in% unique(t1$year)))])

i <- "COL"
PLs <- c(3.008,2.036,1.868,1.81,1.804,1.587,1.616,1.62,1.622,1.66,1.67,1.672,1.705,1.801,1.834,1.828,1.788,1.824,1.815,1.81,1.806,1.799,1.827)
TargetYears <- c(1980,1988,1989,1991,1992,1996,1999,2000,2001,2002,2003,2004,2005,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017)
t1 <- povcalnetR::povcalnet_cl(rep(i, length(TargetYears)), PLs, TargetYears)
# note here that 1980;1988;1989;1991 are missing from the tibble compared to:
t2 <- povcalnetR::povcalnet(i)
sort(unique(t2$year)[which(!(unique(t2$year) %in% unique(t1$year)))])

... similarly for the others and lastly Uruguay:

i <- "URY"
PLs <- c(3.282,2.418,2.839,2.457,2.363,2.225,1.538,1.434,1.571,2.658,2.567,2.419,2.089,2.903,2.798,2.828,2.902,2.885,2.938,2.913,2.927,2.97,2.919)
TargetYears <- c(1980,1988,1989,1991,1992,1996,1999,2000,2001,2002,2003,2004,2005,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017)
t1 <- povcalnetR::povcalnet_cl(rep(i, length(TargetYears)), PLs, TargetYears)
# note here that 1992;1995;1996;1997;1998;2000;2001;2002;2003;2004;2005 are missing from the tibble compared to:
t2 <- povcalnetR::povcalnet(i, 1.9, TargetYears)
sort(unique(t2$year)[which(!(unique(t2$year) %in% unique(t1$year)))])
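The year comparison used in each script above can be written more compactly with base R's `setdiff()`, and the same step can show which coverage type the dropped surveys have. This is only a sketch: `missing_years()` is a hypothetical helper, the toy table is made-up data rather than real PovcalNet output, and the `coveragetype` column name and its values are assumptions about what `povcalnetR::povcalnet()` returns.

```r
# Hypothetical helper (not part of povcalnetR): years present in the
# full result but absent from the custom-poverty-line result.
missing_years <- function(full_years, returned_years) {
  sort(setdiff(unique(full_years), unique(returned_years)))
}

# Made-up toy table standing in for a povcalnetR::povcalnet() result.
# The `coveragetype` column name and its values are assumptions.
t2_toy <- data.frame(
  year = c(1990, 1992, 1997),
  coveragetype = c("N", "U", "N"),
  stringsAsFactors = FALSE
)
# Made-up years standing in for unique(t1$year)
t1_toy_years <- c(1990, 1997)

gaps <- missing_years(t2_toy$year, t1_toy_years)

# Coverage type(s) of the surveys that were dropped
dropped <- t2_toy[t2_toy$year %in% gaps, c("year", "coveragetype")]
print(dropped)
```

With real data, `t2_toy$year` and `t1_toy_years` would be replaced by `t2$year` and `t1$year` from the scripts above.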

tonyfujs commented 4 years ago

Thanks @morkor. The issue comes from the different coverage types. ARG is the easiest example because it only has urban coverage. When coverage is not explicitly specified, the PovcalNet API returns national coverage only, which is why you get no data when querying ARG.

You can get what you want by passing the coverage parameter:
`i <- "ARG"
coverage <- "urban"
PLs <- c(3.149,2.066,2.704,1.155,1.552,1.737,1.500,2.132,1.848,1.500,1.430,1.375,2.104,2.152,2.095,2.203,1.768,1.669,1.519,1.841,1.918,1.698,1.570,1.424,1.243,1.092)
TargetYears <- c(1980,1986,1987,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013)
t1 <- povcalnetR::povcalnet_cl(rep(i, length(TargetYears)), PLs, TargetYears, coverage = rep(coverage, length(TargetYears)))`

I agree that this is a bit confusing... I will close this issue, but open another to improve the povcalnet_cl() function so it returns a useful message when running into these cases.
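For illustration only, a check of the kind mentioned above might look like the sketch below. `warn_missing_years()` is a hypothetical function, not something povcalnetR provides, and the years in the usage lines are made up.

```r
# Hypothetical check (not part of povcalnetR): warn when the returned
# result lacks some of the requested years for the given coverage.
warn_missing_years <- function(requested, returned, coverage = "national") {
  gaps <- sort(setdiff(unique(requested), unique(returned)))
  if (length(gaps) > 0) {
    warning(
      "No '", coverage, "' data for year(s): ",
      paste(gaps, collapse = ";"),
      ". Other coverage types (e.g. 'urban') may have data.",
      call. = FALSE
    )
  }
  invisible(gaps)
}

# Usage with made-up years: capture the warning text instead of emitting it
msg <- tryCatch(
  warn_missing_years(c(1990, 1992, 1997), c(1990, 1997)),
  warning = function(w) conditionMessage(w)
)
print(msg)
```

Returning the gaps invisibly lets a caller decide whether to retry the missing years with another coverage type.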