expersso / BIS

Programmatic access to BIS data

get_bis() drops information during conversion #3

Open stefanangrick opened 5 years ago

Summary: When downloading the "Triennial Survey statistics on turnover" and "OTC derivatives outstanding" datasets with get_bis(), information is lost while the original files are converted. For example, the "denomination leg 1" and "denomination leg 2" columns appear to be dropped, so the imported datasets are incomplete.

Steps to reproduce:

library("BIS")
ds  <- get_datasets()

der <- get_bis(ds$url[grep("Triennial Survey statistics on", ds$name)])
otc <- get_bis(ds$url[grep("OTC derivatives outstanding", ds$name)])

Expected result: The calls to get_bis() should run through and return the full datasets, with all columns intact.

What happens instead: The data is downloaded and assigned to the objects on the left-hand side of the "<-" operator. However, some information appears to be lost during the internal conversion. To see this, compare the output of the colnames() calls below, where der and otc come from get_bis() and der2 and otc2 are read directly from the downloaded CSV files.

download.file(ds$url[grep("Triennial Survey statistics on", ds$name)],
              mode = "wb", destfile = "full_BIS_DER_csv.zip")
unzip("full_BIS_DER_csv.zip")
der2 <- read.csv("full_WEBSTATS_DER_DATAFLOW_csv.csv", header = TRUE,
                 stringsAsFactors = FALSE, skip = 10)
colnames(der)
colnames(der2)

download.file(ds$url[grep("OTC derivatives outstanding", ds$name)],
              mode = "wb", destfile = "full_bis_otc_csv.zip")
unzip("full_bis_otc_csv.zip")
otc2 <- read.csv("WEBSTATS_OTC_DATAFLOW_csv_col.csv", header = TRUE,
                stringsAsFactors = FALSE)
colnames(otc)
colnames(otc2)
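
Comparing the two colnames() outputs by eye is error-prone, so the difference can also be computed. The sketch below is only an illustration: normalise_names() and dropped_columns() are hypothetical helpers (not part of the BIS package), the toy column names are made up, and the name normalisation assumes the two import paths may differ in case and punctuation.

```r
# Hypothetical helper: bring column names to a comparable form
# (lower case, runs of non-alphanumerics collapsed to "_").
normalise_names <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_+|_+$", "", x)
}

# Hypothetical helper: columns present in the raw CSV import but
# absent from the converted data frame, reported under their raw names.
dropped_columns <- function(converted, raw) {
  raw_names <- colnames(raw)
  missing <- !(normalise_names(raw_names) %in%
                 normalise_names(colnames(converted)))
  raw_names[missing]
}

# Toy data standing in for the real downloads (column names are assumptions).
raw_toy <- data.frame(Frequency = "A", DER_CURR_LEG1 = "USD",
                      DER_CURR_LEG2 = "EUR", stringsAsFactors = FALSE)
converted_toy <- data.frame(frequency = "A", stringsAsFactors = FALSE)

dropped <- dropped_columns(converted_toy, raw_toy)
print(dropped)
```

With the objects from this report, the equivalent calls would be dropped_columns(der, der2) and dropped_columns(otc, otc2). If the converted object is in long format, the per-period columns of the raw file would presumably be listed as well and can be ignored.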

Note that get_bis() also emits the following warning (not an error) before completing:

Warning message:
In evalq(as.numeric(obs_value), <environment>) : NAs introduced by coercion

This message also appears when downloading the datasets "Global liquidity indicators", "Credit to the non-financial sector" and "Property prices: selected series".
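
To narrow down which values trigger the coercion warning, one option is to list the entries that as.numeric() turns into NA. This is a rough sketch rather than a diagnosis: non_numeric_values() is a hypothetical helper, and the toy vector is made up, since I have not checked which placeholder strings the BIS files actually contain.

```r
# Hypothetical helper: distinct entries that are not NA in the input
# but come back as NA from as.numeric().
non_numeric_values <- function(x) {
  x <- as.character(x)
  parsed <- suppressWarnings(as.numeric(x))
  unique(x[is.na(parsed) & !is.na(x)])
}

# Toy values standing in for a raw observation column (assumed content).
toy_values <- c("1.5", "..", "2", NA, "n/a", "..")

bad <- non_numeric_values(toy_values)
print(bad)
```

Applied to a value column of the directly imported data (der2 or otc2), this should show whether the NAs stem from placeholder strings or from something else in the conversion.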

System info:

sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BIS_0.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       rstudioapi_0.8   bindr_0.1.1      xml2_1.2.0       magrittr_1.5    
 [6] hms_0.4.2        tidyselect_0.2.5 rvest_0.3.2      R6_2.3.0         rlang_0.3.0.1   
[11] httr_1.3.1       dplyr_0.7.8      tools_3.5.1      yaml_2.2.0       assertthat_0.2.0
[16] tibble_1.4.2     crayon_1.3.4     bindrcpp_0.2.2   tidyr_0.8.2      purrr_0.2.5     
[21] readr_1.1.1      curl_3.2         glue_1.3.0       compiler_3.5.1   pillar_1.3.0    
[26] pkgconfig_2.0.2 

Thanks for the great package!