bcgov / rems

An R package to access data from British Columbia's Environmental Monitoring System
Apache License 2.0

unable to download historic data #34

Closed sebdalgarno closed 5 years ago

sebdalgarno commented 5 years ago

Two of us at Poisson Consulting have not been able to download the historic database using download_historic_data(). I've tried 4 or 5 times and it always errors out at 23% downloaded, which leads me to think maybe it's not just our bad Haida Gwaii internet. Can you confirm that it is possible to download using that function?

boshek commented 5 years ago

FWIW I just tried it on my machine and it downloads fine and creates the db fine as well. 🤷‍♂️

sebdalgarno commented 5 years ago

OK thanks very much for checking it out. Must be an internet issue on my end. Haida Gwaii life!

ateucher commented 5 years ago

This has happened to me and others before, and I think it was a server configuration issue... I'll look into it.

ateucher commented 5 years ago

@sebdalgarno can you please post the error code you are getting, as well as your sessionInfo()?

sebdalgarno commented 5 years ago

> download_historic_data()
rems would like to store a copy of the historic ems data at /Users/sebastiandalgarno/Library/Application Support/rems/ems.sqlite. Is that okay? 

1: Yes
2: No

Selection: 1
This is going to take a while...
Downloading latest 'historic' EMS data from BC Data Catalogue (url:https://pub.data.gov.bc.ca/datasets/949f2233-9612-4b06-92a9-903e817da659/ems_sample_results_historic_expanded.csv)
  |========================                                                                                 |  23%
Error in curl::curl_fetch_disk(url, x$path, handle = handle) : 
  transfer closed with 3630177949 bytes remaining to read

> traceback()
7: curl::curl_fetch_disk(url, x$path, handle = handle)
6: request_fetch.write_disk(req$output, req$url, handle)
5: request_fetch(req$output, req$url, handle)
4: request_perform(req, hu$handle$handle)
3: httr::GET(url, httr::write_disk(tfile), httr_progress())
2: download_ems_data(url)
1: download_historic_data()

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  utils     datasets  grDevices methods   base     

other attached packages:
[1] rems_0.4.0.9999 slackr_1.4.2    drat_0.1.4      testthat_2.0.1  usethis_1.4.0   devtools_2.0.1 

loaded via a namespace (and not attached):
 [1] storr_1.2.1       tidyselect_0.2.5  remotes_2.0.2     purrr_0.3.1       colorspace_1.4-0  vctrs_0.1.0.9002 
 [7] yaml_2.2.0        blob_1.1.1.9000   rlang_0.3.1       pkgbuild_1.0.2    pillar_1.3.1      glue_1.3.1       
[13] withr_2.1.2       DBI_1.0.0         rappdirs_0.3.1    bit64_0.9-7       sessioninfo_1.1.1 plyr_1.8.4       
[19] stringr_1.4.0     munsell_0.5.0     gtable_0.2.0      memoise_1.1.0     callr_3.2.0       ps_1.3.0         
[25] curl_3.3          Rcpp_1.0.0        readr_1.3.1       scales_1.0.0      backports_1.1.3   desc_1.2.0       
[31] pkgload_1.0.2     jsonlite_1.6      fs_1.2.6          bit_1.1-14        ggplot2_3.1.0     hms_0.4.2        
[37] digest_0.6.18     stringi_1.3.1     processx_3.3.0    dplyr_0.8.0.1     grid_3.5.3        rprojroot_1.3-2  
[43] cli_1.1.0         tools_3.5.3       magrittr_1.5      lazyeval_0.2.1    tibble_2.0.1      RSQLite_2.1.1    
[49] crayon_1.3.4      pkgconfig_2.0.2   zeallot_0.1.0     xml2_1.2.0        prettyunits_1.0.2 assertthat_0.2.0 
[55] httr_1.4.0        rstudioapi_0.10   R6_2.4.0          compiler_3.5.3 

boshek commented 5 years ago

And for me (where it worked) on a windows machine:

- Session info ----------------------------------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.5.3 (2019-03-11)
 os       Windows >= 8 x64            
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_Canada.1252         
 ctype    English_Canada.1252         
 tz       America/Los_Angeles         
 date     2019-03-27                  

- Packages --------------------------------------------------------------------------------------------------------------------------------------------
 package     * version    date       lib source                     
 assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.5.2)             
 backports     1.1.3      2018-12-14 [1] CRAN (R 3.5.2)             
 bit           1.1-14     2018-05-29 [1] CRAN (R 3.5.2)             
 bit64         0.9-7      2017-05-08 [1] CRAN (R 3.5.2)             
 blob          1.1.1      2018-03-25 [1] CRAN (R 3.5.3)             
 callr         3.2.0      2019-03-15 [1] CRAN (R 3.5.3)             
 cli           1.1.0      2019-03-19 [1] CRAN (R 3.5.3)             
 clisymbols    1.2.0      2017-05-21 [1] CRAN (R 3.5.3)             
 commonmark    1.7        2018-12-01 [1] CRAN (R 3.5.3)             
 crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.3)             
 curl          3.3        2019-01-10 [1] CRAN (R 3.5.3)             
 DBI           1.0.0      2018-05-02 [1] CRAN (R 3.5.3)             
 desc          1.2.0      2018-05-01 [1] CRAN (R 3.5.3)             
 devtools    * 2.0.1      2018-10-26 [1] CRAN (R 3.5.3)             
 digest        0.6.18     2018-10-10 [1] CRAN (R 3.5.3)             
 dplyr         0.8.0.1    2019-02-15 [1] CRAN (R 3.5.3)             
 fs            1.2.7      2019-03-19 [1] CRAN (R 3.5.3)             
 git2r         0.25.2     2019-03-19 [1] CRAN (R 3.5.3)             
 glue          1.3.1      2019-03-12 [1] CRAN (R 3.5.3)             
 hms           0.4.2      2018-03-10 [1] CRAN (R 3.5.3)             
 httr          1.4.0      2018-12-11 [1] CRAN (R 3.5.2)             
 hunspell      3.0        2018-12-15 [1] CRAN (R 3.5.3)             
 jsonlite      1.6        2018-12-07 [1] CRAN (R 3.5.3)             
 knitr         1.22       2019-03-08 [1] CRAN (R 3.5.3)             
 lobstr      * 1.0.1      2018-12-21 [1] CRAN (R 3.5.3)             
 magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.3)             
 memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.3)             
 parsedate     1.1.3      2017-03-02 [1] CRAN (R 3.5.3)             
 pillar        1.3.1      2018-12-15 [1] CRAN (R 3.5.3)             
 pkgbuild      1.0.3      2019-03-20 [1] CRAN (R 3.5.3)             
 pkgconfig     2.0.2      2018-08-16 [1] CRAN (R 3.5.3)             
 pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.3)             
 prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.3)             
 processx      3.3.0      2019-03-10 [1] CRAN (R 3.5.3)             
 ps            1.3.0      2018-12-21 [1] CRAN (R 3.5.3)             
 purrr         0.3.2      2019-03-15 [1] CRAN (R 3.5.3)             
 R6            2.4.0      2019-02-14 [1] CRAN (R 3.5.3)             
 randquotes    0.1.0      2018-05-11 [1] CRAN (R 3.5.3)             
 rappdirs      0.3.1      2016-03-28 [1] CRAN (R 3.5.3)             
 rcmdcheck     1.3.2      2018-11-10 [1] CRAN (R 3.5.3)             
 Rcpp          1.0.1      2019-03-17 [1] CRAN (R 3.5.3)             
 readr         1.3.1      2018-12-21 [1] CRAN (R 3.5.3)             
 rematch       1.0.1      2016-04-21 [1] CRAN (R 3.5.3)             
 remotes       2.0.2      2018-10-30 [1] CRAN (R 3.5.3)             
 rems          0.4.0.9999 2019-03-27 [1] Github (bcgov/rems@ac34dbd)
 rhub          1.1.0      2019-03-25 [1] CRAN (R 3.5.3)             
 rlang         0.3.2      2019-03-21 [1] CRAN (R 3.5.3)             
 rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.3)             
 RSQLite       2.1.1      2018-05-06 [1] CRAN (R 3.5.3)             
 rstudioapi    0.10       2019-03-19 [1] CRAN (R 3.5.3)             
 rversions     1.0.3      2016-08-02 [1] CRAN (R 3.5.3)             
 sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.3)             
 spelling      2.1        2019-03-11 [1] CRAN (R 3.5.3)             
 storr         1.2.1      2018-10-18 [1] CRAN (R 3.5.3)             
 stringi       1.4.3      2019-03-12 [1] CRAN (R 3.5.3)             
 stringr       1.4.0      2019-02-10 [1] CRAN (R 3.5.3)             
 testthat    * 2.0.1      2018-10-13 [1] CRAN (R 3.5.3)             
 tibble        2.1.1      2019-03-16 [1] CRAN (R 3.5.3)             
 tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.5.3)             
 usethis     * 1.4.0      2018-08-14 [1] CRAN (R 3.5.3)             
 uuid          0.1-2      2015-07-28 [1] CRAN (R 3.5.2)             
 whoami        1.3.0      2019-03-19 [1] CRAN (R 3.5.3)             
 withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.3)             
 xfun          0.5        2019-02-20 [1] CRAN (R 3.5.3)             
 xml2          1.2.0      2018-01-24 [1] CRAN (R 3.5.3)             
 xopen         1.0.0      2018-09-17 [1] CRAN (R 3.5.3)             
 yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.2)       

ateucher commented 5 years ago

I'm not really sure why this worked for @boshek, as it seems there was a 4GB file download limit set on the server, and this file is > 4GB. They have tweaked the server settings, and it should take effect overnight, so please try again tomorrow @sebdalgarno.
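
For what it's worth, the numbers in the error message above are consistent with a file over 4GB. A rough back-of-the-envelope check in base R, assuming the 23% shown by the progress bar is rounded to the nearest percent and that the "4GB" limit means 4 GiB (both are assumptions, not something the server settings confirm):

```r
# Figures taken from the error message earlier in this thread.
bytes_remaining <- 3630177949   # "transfer closed with ... bytes remaining to read"
pct_shown       <- 23           # progress bar value when the transfer stopped

# Assumption: the progress bar rounds to the nearest percent, so the true
# fraction downloaded lies somewhere between 22.5% and 23.5%.
frac_done <- c(low = 0.225, shown = pct_shown / 100, high = 0.235)

# remaining = total * (1 - frac_done)  =>  total = remaining / (1 - frac_done)
est_total_bytes <- bytes_remaining / (1 - frac_done)

# Assumption: the server's "4GB" limit is 4 GiB.
limit_bytes <- 4 * 1024^3

round(est_total_bytes / 1e9, 2)      # estimated total size in GB
all(est_total_bytes > limit_bytes)   # is every estimate above the limit?
```

Each estimate comes out at roughly 4.7 GB, so the file would be over the limit whichever way the percentage was rounded.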

sebdalgarno commented 5 years ago

Thanks @ateucher! I'll try again tomorrow.

sebdalgarno commented 5 years ago

Just tried again with the same error (again, at 23%). For now I am working around this by downloading the CSV and using rems:::save_historic_data() to generate the database, storing it at the path generated by rems:::write_db_path(). Do you see any issue with that approach?
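
A minimal sketch of that workaround, for anyone else who ends up here. Both rems functions are unexported internals, so their signatures may change. The argument order used below (CSV path, database path, then a chunk size) is an assumption, so check `args(rems:::save_historic_data)` before running it. The wrapper name `build_historic_db_from_csv` is made up for this example.

```r
# Hypothetical helper: build the local historic database from a CSV that was
# obtained some other way (e.g. copied from a colleague).
build_historic_db_from_csv <- function(csv_file) {
  # Fail early if the CSV is not where we think it is.
  if (!file.exists(csv_file)) {
    stop("CSV file not found: ", csv_file, call. = FALSE)
  }
  if (!requireNamespace("rems", quietly = TRUE)) {
    stop("The rems package must be installed.", call. = FALSE)
  }

  # Internal function: the path where rems expects ems.sqlite to live.
  db_path <- rems:::write_db_path()

  # Internal function. ASSUMPTION: arguments are (csv path, db path, chunk size);
  # verify with args(rems:::save_historic_data) before relying on this.
  rems:::save_historic_data(csv_file, db_path, 1e6)

  invisible(db_path)
}

# Usage (path is illustrative):
# build_historic_db_from_csv("~/Downloads/ems_sample_results_historic_expanded.csv")
```

This sketch does not attempt any other bookkeeping that download_historic_data() may do after the database is written.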

ateucher commented 5 years ago

I think that approach should work, though obviously not ideal. Are you downloading the csv through your browser? It's surprising to me that that works, but not through the package...

ateucher commented 5 years ago

@sebdalgarno it is working for me now, both through the package, and on the command line with:

$ curl https://pub.data.gov.bc.ca/datasets/949f2233-9612-4b06-92a9-903e817da659/ems_sample_results_historic_expanded.csv --output foo.csv
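
Since the failure mode here is a transfer that stops partway, it can help to confirm that a file downloaded this way is complete. A small sketch using httr, assuming the server reports a Content-Length header (the "bytes remaining to read" error suggests it does); the helper names `expected_size` and `download_complete` are made up for this example:

```r
# Ask the server how many bytes it intends to send (HEAD request, no body).
# Returns NA if the server does not report a Content-Length.
expected_size <- function(url) {
  resp <- httr::HEAD(url)
  httr::stop_for_status(resp)
  len <- httr::headers(resp)[["content-length"]]
  if (is.null(len)) NA_real_ else as.numeric(len)
}

# TRUE only if the local file exists and is exactly the expected size.
download_complete <- function(path, expected_bytes) {
  file.exists(path) &&
    !is.na(expected_bytes) &&
    file.size(path) == expected_bytes
}

# Usage, with the URL and output file from the curl command above:
# url <- "https://pub.data.gov.bc.ca/datasets/949f2233-9612-4b06-92a9-903e817da659/ems_sample_results_historic_expanded.csv"
# download_complete("foo.csv", expected_size(url))
```

A size match does not prove the contents are intact, but a mismatch is a cheap way to catch a truncated transfer before trying to build the database from it.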

sebdalgarno commented 5 years ago

I should clarify: I'm still unable to download the CSV myself, but my colleague downloaded it successfully a few weeks ago, so I am using that copy.

sebdalgarno commented 5 years ago

I will close this for now. I am building a package that runs a shiny app to access the database (making heavy use of the rems package). It will require the user to first download the historical database, so if my users run into issues I will re-open.