ajdamico / lodown

locally download and prepare publicly-available microdata
GNU General Public License v3.0

update urls, switch from cachaca to download.file #158

Closed by hjacobcarlson 4 years ago

hjacobcarlson commented 4 years ago

I think this fixes the changed URLs and the problems with using `cachaca` on the Census website.
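For anyone reading along, here is a minimal sketch of the kind of direct `download.file` call this switches to. It is not the code in the pull request: `fetch_file` is a hypothetical helper name, and the choice of `mode = "wb"` and the status check are my assumptions about what a robust call looks like.

```r
# Hypothetical helper (not part of lodown): download one file with
# utils::download.file and fail loudly if the transfer did not succeed.
fetch_file <- function(url, destfile = tempfile(fileext = ".txt")) {
  # mode = "wb" avoids line-ending translation on Windows
  status <- utils::download.file(url, destfile, mode = "wb", quiet = TRUE)

  # download.file returns 0 (invisibly) on success
  if (status != 0) stop("download failed with status ", status)

  # a zero-byte or missing file is treated as a failed download too
  if (!file.exists(destfile) || file.size(destfile) == 0) {
    stop("downloaded file is missing or empty: ", destfile)
  }

  destfile
}
```

Usage would look like `tf <- fetch_file(nychvs_cat[1, "full_url"])`, with the returned path then handed to the SAScii step.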

But there's still something wrong in the SAScii parsing step that I don't understand, so in that sense the pull request is incomplete.

Below is the output with the traceback on the error:

> nychvs_cat <- get_catalog("nychvs",
+                           output_dir = file.path( path.expand( "~" ), "NYCHVS"))
building catalog for nychvs

> nychvs_cat <- subset(nychvs_cat, year == 2014)
> lodown( "nychvs" , nychvs_cat )
locally downloading nychvs

trying URL 'https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/uf_14_repwgt_occ_web.txt'
downloaded 19.3 MB

R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lodown_0.1.0   usethis_1.5.0  devtools_2.0.2

loaded via a namespace (and not attached):
 [1] colorspace_1.4-1    rprojroot_1.3-2     htmlTable_1.13.3    base64enc_0.1-3    
 [5] fs_1.2.7            rstudioapi_0.10     roxygen2_7.0.2      remotes_2.0.4      
 [9] bit64_0.9-7         fansi_0.4.0         xml2_1.2.2          splines_3.5.3      
[13] R.methodsS3_1.7.1   knitr_1.26          pkgload_1.0.2       zeallot_0.1.0      
[17] Formula_1.2-3       packrat_0.5.0       cluster_2.0.8       R.oo_1.23.0        
[21] readr_1.3.1         compiler_3.5.3      httr_1.4.1          backports_1.1.5    
[25] assertthat_0.2.1    Matrix_1.2-17       lazyeval_0.2.2      survey_3.36        
[29] cli_2.0.0           acepack_1.4.1       htmltools_0.4.0     prettyunits_1.0.2  
[33] tools_3.5.3         gtable_0.3.0        glue_1.3.1          dplyr_0.8.0.1      
[37] Rcpp_1.0.3          xopen_1.0.0         vctrs_0.2.0         xfun_0.11          
[41] stringr_1.4.0       ps_1.3.0            testthat_2.1.0      rvest_0.3.5        
[45] lifecycle_0.1.0     XML_3.98-1.20       scales_1.1.0        hms_0.5.2          
[49] RColorBrewer_1.1-2  curl_4.3            memoise_1.1.0       SAScii_1.0         
[53] gridExtra_2.3       ggplot2_3.2.1       rcmdcheck_1.3.2     rpart_4.1-15       
[57] latticeExtra_0.6-28 stringi_1.4.3       RSQLite_2.1.4       desc_1.2.0         
[61] checkmate_1.9.4     pkgbuild_1.0.3      rlang_0.4.2         pkgconfig_2.0.3    
[65] bitops_1.0-6        lattice_0.20-38     purrr_0.3.3         htmlwidgets_1.5.1  
[69] bit_1.1-14          processx_3.4.0      tidyselect_0.2.5    magrittr_1.5       
[73] R6_2.4.1            Hmisc_4.3-0         DBI_1.0.0           pillar_1.4.2       
[77] haven_2.2.0         foreign_0.8-71      withr_2.1.2         survival_2.44-1.1  
[81] RCurl_1.95-4.12     nnet_7.3-12         tibble_2.1.3        crayon_1.3.4       
[85] grid_3.5.3          data.table_1.12.8   blob_1.2.0          callr_3.3.0        
[89] forcats_0.4.0       digest_0.6.23       R.utils_2.9.2       munsell_0.5.0      
[93] mitools_2.4         sessioninfo_1.1.1  

lodown is now exiting unexpectedly.
websites that host publicly-downloadable microdata change often and sometimes those changes cause this software to break.
if the error call stack below appears to be a hiccup in your internet connection, then please verify your connectivity and retry the download.
otherwise, please open a new issue at `https://github.com/ajdamico/asdfree/issues` with the contents of this error call stack and also the output of your `sessionInfo()`.

[[1]]
lodown("nychvs", nychvs_cat)

[[2]]
withCallingHandlers(catalog <- load_fun(data_name = data_name, 
    catalog, ...), error = function(e) {
    print(sessionInfo())
    if (grepl("cannot allocate vector of size", e)) 
        message(memory_note)
    else if (grepl("parameter must be specified", e)) 
        message(parameter_note)
    else if (grepl("to install", e)) 
        message(installation_note)
    else {
        message(unknown_error_note)
        print(sys.calls())
    }
})

[[3]]
load_fun(data_name = data_name, catalog, ...)

[[4]]
read_SAScii(tf, cleaned.sas.script, beginline = catalog[i, "beginline"])

[[5]]
suppressWarnings(sasc <- SAScii::parse.SAScii(tf, beginline = beginline, 
    lrecl = lrecl))

[[6]]
withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))

[[7]]
SAScii::parse.SAScii(tf, beginline = beginline, lrecl = lrecl)

[[8]]
SAS.uncomment(SASinput, "*", ";")

[[9]]
sub(substr(SASinput[i], slash_asterisk[1], asterisk_slash[1] + 
    1), "", SASinput[i], fixed = T)

[[10]]
.handleSimpleError(function (e) 
{
    print(sessionInfo())
    if (grepl("cannot allocate vector of size", e)) 
        message(memory_note)
    else if (grepl("parameter must be specified", e)) 
        message(parameter_note)
    else if (grepl("to install", e)) 
        message(installation_note)
    else {
        message(unknown_error_note)
        print(sys.calls())
    }
}, "zero-length pattern", quote(sub(substr(SASinput[i], slash_asterisk[1], 
    asterisk_slash[1] + 1), "", SASinput[i], fixed = T)))

[[11]]
h(simpleError(msg, call))

 Error in sub(substr(SASinput[i], slash_asterisk[1], asterisk_slash[1] +  : 
  zero-length pattern 
9. sub(substr(SASinput[i], slash_asterisk[1], asterisk_slash[1] + 
       1), "", SASinput[i], fixed = T)
8. SAS.uncomment(SASinput, "*", ";")
7. SAScii::parse.SAScii(tf, beginline = beginline, lrecl = lrecl)
6. withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
5. suppressWarnings(sasc <- SAScii::parse.SAScii(tf, beginline = beginline, 
       lrecl = lrecl)) at sascii.R#16
4. read_SAScii(tf, cleaned.sas.script, beginline = catalog[i, "beginline"]) at nychvs.R#124
3. load_fun(data_name = data_name, catalog, ...)
2. withCallingHandlers(catalog <- load_fun(data_name = data_name, 
       catalog, ...), error = function(e) {
       print(sessionInfo())
       if (grepl("cannot allocate vector of size", e))  ... at lodown.R#67
1. lodown("nychvs", nychvs_cat)
   type year
16  occ 2014
17  vac 2014
18 pers 2014
                                                                                            full_url
16  https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/uf_14_repwgt_occ_web.txt
17  https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/uf_14_repwgt_vac_web.txt
18 https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/uf_14_repwgt_pers_web.txt
                                                                                           sas_ri
16 https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/sas_import_program.txt
17 https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/sas_import_program.txt
18 https://www2.census.gov/programs-surveys/nychvs/datasets/2014/microdata/sas_import_program.txt
   beginline                         output_filename case_count
16         9  /Users/jakecarlson/NYCHVS/2014/occ.rds         NA
17       561  /Users/jakecarlson/NYCHVS/2014/vac.rds         NA
18       413 /Users/jakecarlson/NYCHVS/2014/pers.rds         NA
> 
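One possible reading of the `zero-length pattern` error, offered as a guess rather than a diagnosis: `sub(..., fixed = TRUE)` refuses an empty pattern, and `substr()` returns `""` whenever the start position is past the stop position. That would happen if a line of the SAS script contains `*/` before `/*`. The sketch below reproduces that situation on a made-up line and shows a guarded version; `strip_inline_comment` is a hypothetical function, not SAScii's actual code, and I have not confirmed that the 2014 import script contains such a line.

```r
# A made-up line where the closing "*/" comes before the opening "/*".
line <- "a */ b /* c"
open_pos  <- as.integer(regexpr("/*", line, fixed = TRUE))  # 8
close_pos <- as.integer(regexpr("*/", line, fixed = TRUE))  # 3

# start (8) is greater than stop (4), so substr() yields an empty string,
# which sub() then rejects as a pattern when fixed = TRUE.
pattern <- substr(line, open_pos, close_pos + 1)
msg <- tryCatch(
  sub(pattern, "", line, fixed = TRUE),
  error = function(e) conditionMessage(e)
)

# Hypothetical guarded version: only strip when a well-formed
# "/* ... */" pair exists on the line, otherwise return it unchanged.
strip_inline_comment <- function(line) {
  open_pos  <- as.integer(regexpr("/*", line, fixed = TRUE))
  close_pos <- as.integer(regexpr("*/", line, fixed = TRUE))
  if (open_pos > 0 && close_pos > open_pos + 1) {
    comment <- substr(line, open_pos, close_pos + 1)
    line <- sub(comment, "", line, fixed = TRUE)
  }
  line
}
```

If this is the cause, a guard of this kind in the uncommenting step, or cleaning the offending line before passing the script to `parse.SAScii`, might be enough to get past the error.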
ajdamico commented 4 years ago

thanks