IQSS / dataverse-client-r

R Client for Dataverse Repositories
https://iqss.github.io/dataverse-client-r

HTTP 503 on data that used to work #130

Closed (vanatteveldt closed 1 year ago)

vanatteveldt commented 1 year ago

I'm trying to download the countypres_2000-2020.tab file from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ&version=9.0, but since last week I'm getting HTTP 503 errors even though manual downloading does work.

Note that you need to 'accept' the CC0 license before downloading, but this never caused an issue with dataverse until last week (I use this in a classroom example, so I regularly have people download it).

Is there a way to solve this?

## load package
library("dataverse")

## code goes here
> get_dataframe_by_name("countypres_2000-2020.tab",
+                           dataset = "10.7910/DVN/VOQCHQ",
+                           server = "dataverse.harvard.edu")
Error in dataverse_search(entityId = x, type = "file", server = server,  : 
  Service Unavailable (HTTP 503).
> traceback()
8: stop(http_condition(x, "error", task = task, call = call))
7: httr::stop_for_status(r, task = httr::content(r)$message)
6: dataverse_search(entityId = x, type = "file", server = server, 
       key = key)
5: withCallingHandlers(expr, message = function(c) if (inherits(c, 
       classes)) tryInvokeRestart("muffleMessage"))
4: suppressMessages(dataverse_search(entityId = x, type = "file", 
       server = server, key = key))
3: is_ingested(fileid, ...)
2: get_dataframe_by_id(fileid, .f, original = original, ...)
1: get_dataframe_by_name("countypres_2000-2020.tab", dataset = "10.7910/DVN/VOQCHQ", 
       server = "dataverse.harvard.edu")

## session info for your system
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=nl_NL.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=nl_NL.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Amsterdam
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dataverse_0.3.13 printr_0.3      

loaded via a namespace (and not attached):
 [1] digest_0.6.31   R6_2.5.1        fastmap_1.1.1   xfun_0.39       knitr_1.42      htmltools_0.5.5 rmarkdown_2.21  xml2_1.3.3     
 [9] cli_3.6.1       compiler_4.3.1  httr_1.4.5      rstudioapi_0.14 tools_4.3.1     curl_5.0.2      evaluate_0.20   yaml_2.3.7     
[17] jsonlite_1.8.4  rlang_1.1.1

kuriwaki commented 1 year ago

Are we sure that the dataset had a CC0 step before this? I would have thought the CC0 setup would prevent downloading (unless we set an API key).

kuriwaki commented 1 year ago

May be related to #131

kuriwaki commented 1 year ago

@vanatteveldt your code is working on my end now. Can you verify?

get_dataframe_by_name("countypres_2000-2020.tab",
                     dataset = "10.7910/DVN/VOQCHQ",
                     server = "dataverse.harvard.edu")

vanatteveldt commented 1 year ago

Yes, it works, thanks/sorry!