iqss-research / readme-software

Readme2: An R Package for Improved Automated Nonparametric Content Analysis for Social Science

download_wordvecs() #5

Closed rochelleterman closed 2 years ago

rochelleterman commented 2 years ago

Hey there,

I was unable to run download_wordvecs(). I received the following error and warnings:

Error in download.file(url, destfile = file.path(targetDir, filename)) : 
  download from 'http://gking-projects.iq.harvard.edu/files/glove.6B.200d.zip' failed
In addition: Warning messages:
1: In download.file(url, destfile = file.path(targetDir, filename)) :
  downloaded length 138870485 != reported length 240802656
2: In download.file(url, destfile = file.path(targetDir, filename)) :
  URL 'https://gking-projects.iq.harvard.edu/files/glove.6B.200d.zip': Timeout of 60 seconds was reached

Looking at the readme install directory, I can see the zip file but cannot unzip it. I ended up downloading manually, unzipping, and moving to the install directory. This worked.

Session Info below.

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] readme_2.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7            lattice_0.20-44       png_0.1-7             digest_0.6.27        
 [5] SnowballC_0.7.0       grid_4.1.0            jsonlite_1.7.2        magrittr_2.0.1       
 [9] tokenizers_0.2.1      evaluate_0.14         stringi_1.7.2         tfruns_1.5.0         
[13] rlang_0.4.11          data.table_1.14.0     whisker_0.4           Matrix_1.3-3         
[17] reticulate_1.24       rmarkdown_2.9         tools_4.1.0           xfun_0.24            
[21] yaml_2.2.1            compiler_4.1.0        base64enc_0.1-3       tensorflow_2.8.0.9000
[25] htmltools_0.5.1.1     knitr_1.33           

astrezhnev commented 2 years ago

Thanks Rochelle,

My best guess is that this is a network issue (on either end). I can replicate it on my system only by setting the timeout to a very low number, so that only a fraction of the file is downloaded and the resulting zip is corrupted.

I recommend raising the timeout option to a larger number of seconds before calling download_wordvecs(), for example options(timeout = max(1000, getOption("timeout"))), as suggested here
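
For anyone hitting the same error, here is a minimal sketch of that suggestion combined with a sanity check on the downloaded archive. The helper names is_readable_zip() and download_with_timeout() are hypothetical (they are not part of the readme package), and the destination path in the usage note is an assumption; only options(), download.file() and unzip() from base R are used.

```r
# Sketch only: helper names below are hypothetical, not part of readme.

# Returns TRUE if `path` exists and its zip file listing can be read.
# A download that was cut off by a timeout should fail this check.
is_readable_zip <- function(path) {
  if (!file.exists(path)) return(FALSE)
  tryCatch({
    listing <- utils::unzip(path, list = TRUE)  # list contents, do not extract
    nrow(listing) > 0
  },
  error = function(e) FALSE,
  warning = function(w) FALSE)
}

# Downloads `url` to `destfile` with a timeout of at least `min_timeout`
# seconds, then restores the previous timeout option on exit.
# Returns TRUE only if the download succeeded and the zip is readable.
download_with_timeout <- function(url, destfile, min_timeout = 1000) {
  old <- options(timeout = max(min_timeout, getOption("timeout")))
  on.exit(options(old), add = TRUE)
  status <- utils::download.file(url, destfile = destfile, mode = "wb")
  status == 0 && is_readable_zip(destfile)
}
```

Usage might look like download_with_timeout("https://gking-projects.iq.harvard.edu/files/glove.6B.200d.zip", file.path(tempdir(), "glove.6B.200d.zip")), with the destination chosen to suit your setup. If it returns FALSE, delete the partial file before retrying, since a truncated zip left in the install directory cannot be unzipped.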