I now wonder if this is worth the overhead given that one can just do (from the vignette)

```r
peru_survey <- get_survey("https://doi.org/10.5281/zenodo.1095664")
saveRDS(peru_survey, "peru.rds")
```

and later, in a future session,

```r
peru_survey <- readRDS("peru.rds")
```
or alternatively via #61

```r
peru_files <- download_survey("https://doi.org/10.5281/zenodo.1095664", dir = "Surveys")
peru_survey <- load_survey(peru_files)
saveRDS(peru_files, file.path("Surveys", "peru_files.rds"))
```

and later

```r
peru_files <- readRDS(file.path("Surveys", "peru_files.rds"))
peru_survey <- load_survey(peru_files)
```

which also enables inspection/use of the raw CSV files in "Surveys".
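For instance, a minimal sketch of inspecting those raw files directly (it only assumes the `Surveys` directory created above and base R):

```r
# List the raw CSV files that download_survey() placed in "Surveys"
csv_files <- list.files("Surveys", pattern = "\\.csv$", full.names = TRUE)
print(csv_files)

# Peek at one raw table without going through load_survey()
raw_table <- read.csv(csv_files[1])
str(raw_table)
```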
I think it depends on the position you're taking:

- from the user's point of view, you're entirely right: they could "manually cache" the results if they wish (see the sketch after this list).
- from the server's point of view (zenodo.org), we're hitting them with unnecessary requests to get the same result over and over, and it would be more polite to cache repeated requests. I believe this is especially important here because we rely on web scraping rather than an official API, which is usually better set up to handle automated requests.
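For illustration, a minimal sketch of such user-side caching (the wrapper name `get_survey_cached` and the cache file are hypothetical; it only assumes `get_survey()` as used above):

```r
library(socialmixr)

# Hypothetical user-side wrapper: download the survey only if no cached
# copy exists on disk, otherwise read the cached .rds file.
get_survey_cached <- function(doi, cache_file) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  survey <- get_survey(doi)
  saveRDS(survey, cache_file)
  survey
}

peru_survey <- get_survey_cached(
  "https://doi.org/10.5281/zenodo.1095664",
  cache_file = "peru.rds"
)
```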
As @sbfnk mentioned, it would be useful (and polite) to cache surveys downloaded from Zenodo instead of re-downloading them each time the code is re-run.
The simplest option is probably to use memoise, but there are also other tools specific to HTTP resources (e.g., https://github.com/sckott/webmiddens), so I'm open to discussion / suggestions.
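For example, a minimal sketch of the memoise approach, assuming we would wrap socialmixr's own `get_survey()` (the wrapper name and cache directory are arbitrary choices for illustration):

```r
library(memoise)
library(socialmixr)

# A filesystem cache persists across R sessions; a plain memoise() call
# would only cache results within the current session.
survey_cache <- cache_filesystem("~/.cache/socialmixr")

# Memoised wrapper: the first call downloads from Zenodo; repeated calls
# with the same DOI return the cached result without hitting the server.
get_survey_memoised <- memoise(get_survey, cache = survey_cache)

peru_survey <- get_survey_memoised("https://doi.org/10.5281/zenodo.1095664")
```

Inside the package this would presumably wrap the download step itself rather than asking users to memoise their own calls, but the principle is the same.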