I now wonder if this is worth the overhead given that one can just do (from the vignette)

```r
peru_survey <- get_survey("https://doi.org/10.5281/zenodo.1095664")
saveRDS(peru_survey, "peru.rds")
```

and later, in a future session,

```r
peru_survey <- readRDS("peru.rds")
```
or alternatively via #61

```r
peru_files <- download_survey("https://doi.org/10.5281/zenodo.1095664", dir = "Surveys")
peru_survey <- load_survey(peru_files)
saveRDS(peru_files, file.path("Surveys", "peru_files.rds"))
```

and later

```r
peru_files <- readRDS(file.path("Surveys", "peru_files.rds"))
peru_survey <- load_survey(peru_files)
```

which also enables inspection/use of the raw CSV files in "Surveys".
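For instance, a minimal sketch of inspecting those raw files directly (it only assumes the `Surveys` directory created above and base R):

```r
# List the raw CSV files that download_survey() placed in "Surveys"
csv_files <- list.files("Surveys", pattern = "\\.csv$", full.names = TRUE)
print(csv_files)

# Peek at one raw table without going through load_survey()
raw_table <- read.csv(csv_files[1])
str(raw_table)
```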
I think it depends on the position you're taking:

- from the user's point of view, you're entirely right: they could "manually cache" the results if they wish (see the sketch after this list).
- from the server's point of view (zenodo.org), we're hitting them with unnecessary requests to get the same result over and over, and it would be more polite to cache repeated requests. I believe this is especially important here because we rely on web scraping rather than an official API, which is usually better set up to handle automated requests.
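For illustration, a minimal sketch of such user-side caching (the wrapper name `get_survey_cached` and the cache file are hypothetical; it only assumes `get_survey()` as used above):

```r
library(socialmixr)

# Hypothetical user-side wrapper: download the survey only if no cached
# copy exists on disk, otherwise read the cached .rds file.
get_survey_cached <- function(doi, cache_file) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  survey <- get_survey(doi)
  saveRDS(survey, cache_file)
  survey
}

peru_survey <- get_survey_cached(
  "https://doi.org/10.5281/zenodo.1095664",
  cache_file = "peru.rds"
)
```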
As @sbfnk mentioned, it would be useful (and polite) to cache surveys downloaded from Zenodo instead of re-downloading them each time the code is re-run.
The simplest option is probably to use memoise, but there are also other tools specific to HTTP resources (e.g., https://github.com/sckott/webmiddens), so I'm open to discussion / suggestions.
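For example, a minimal sketch of the memoise approach, assuming we would wrap socialmixr's own `get_survey()` (the wrapper name and cache directory are arbitrary choices for illustration):

```r
library(memoise)
library(socialmixr)

# A filesystem cache persists across R sessions; a plain memoise() call
# would only cache results within the current session.
survey_cache <- cache_filesystem("~/.cache/socialmixr")

# Memoised wrapper: the first call downloads from Zenodo; repeated calls
# with the same DOI return the cached result without hitting the server.
get_survey_memoised <- memoise(get_survey, cache = survey_cache)

peru_survey <- get_survey_memoised("https://doi.org/10.5281/zenodo.1095664")
```

Inside the package this would presumably wrap the download step itself rather than asking users to memoise their own calls, but the principle is the same.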