Closed jflowernet closed 4 weeks ago
Hi @jflowernet
Thank you for the nice words and apologies for the late reply (I didn't get a notification for some reason). Do you have a reproducible example? It is odd, as this package is just a lightweight wrapper around rerddap, which already deals with caching nicely.
See reprex below:
library(biooracler)
# Download average air temperature data
dataset_id = "tas_baseline_2000_2020_depthsurf"
variables = c("tas_mean")
# Decade 2000-2010
time = c('2001-01-01T00:00:00Z', '2010-01-01T00:00:00Z')
# Select northern hemisphere
latitude = c(0, 89.975)
longitude = c(-179.975, 179.975)
# Set up constraints
constraints = list(time, latitude, longitude)
names(constraints) = c("time", "latitude", "longitude")
# Make sure cache is purged
rerddap::cache_delete_all()
# Perform download as netcdf ~25 seconds
system.time({
layer <- download_layers(dataset_id, variables, constraints, fmt = "nc")
})
#> Selected dataset tas_baseline_2000_2020_depthsurf.
#> Dataset info available at: http://erddap.bio-oracle.org/erddap/griddap/tas_baseline_2000_2020_depthsurf.html
#> Selected 1 variables: tas_mean
#> user system elapsed
#> 3.889 1.112 24.851
# Check cached files
rerddap::cache_list()
#> <rerddap cached files>
#> NetCDF files:
#> c31e29ee465de830a8bd07e7512fdf48.nc
#> CSV files:
# Check cache path
rerddap::cache_info()
#> $path
#> [1] "/tmp/RtmpbyBpQW/R/rerddap"
#>
#> $no_files
#> [1] 1
# Get path to cached file
layer_path <- rerddap::cache_details(layer)[[1]]$info$filename
file.exists(layer_path)
#> [1] TRUE
# Download one more time - note the execution time drops to ~1 second
system.time({
layer <- download_layers(dataset_id, variables, constraints, fmt = "nc")
})
#> Selected dataset tas_baseline_2000_2020_depthsurf.
#> Dataset info available at: http://erddap.bio-oracle.org/erddap/griddap/tas_baseline_2000_2020_depthsurf.html
#> Selected 1 variables: tas_mean
#> user system elapsed
#> 0.520 0.418 1.095
Created on 2024-10-18 with reprex v2.1.1
Thanks for the response. It's good to know that biooracler uses rerddap under the hood.
Running your example and some of my own, I can see that the caching is working. I probably wasn't seeing much difference between the first and second calls to the server because my rerddap calls sit inside a function that also does other time-consuming work.
Thanks again for biooracler!
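For anyone hitting the same confusion, a minimal sketch of timing the download step separately from the rest of a function may help. `fake_download()` and `other_work()` are hypothetical stand-ins that only sleep; they are not part of biooracler or rerddap and would be replaced by the real calls.

```r
# Hypothetical stand-in for a download that is slow the first time
# and fast once the result is cached.
cached <- FALSE
fake_download <- function() {
  if (!cached) Sys.sleep(0.3)
  cached <<- TRUE
  invisible(NULL)
}

# Hypothetical stand-in for the other time-consuming work in the same function.
other_work <- function() Sys.sleep(0.6)

# Time each step on its own so the cache effect is not hidden by the other work.
time_steps <- function() {
  t_download <- system.time(fake_download())[["elapsed"]]
  t_other <- system.time(other_work())[["elapsed"]]
  c(download = t_download, other = t_other)
}

first <- time_steps()
second <- time_steps()
print(rbind(first, second))
```

Timed as a whole, the two runs look similar because `other_work()` dominates; timed per step, the drop in the download column is visible.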
First, thanks for biooracler! It is great to be able to do spatial queries on the Bio-Oracle datasets within R. I may have missed something, but it doesn't seem like download_layers() checks first to see if the data has already been downloaded? This would be a great enhancement and would avoid unnecessary calls to the ERDDAP server. The marmap package does this by creating a filename built from the query parameters and checking to see if that file already exists before executing the data download, see here.
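The filename-based check described above could be sketched roughly as follows. This is an illustration in base R, not marmap's or biooracler's actual code: `query_filename()` and `fetch_if_missing()` are hypothetical helpers, and `fetch` stands for any function that writes the downloaded data to the given path.

```r
# Hypothetical helper: build a deterministic file name from the query parameters.
query_filename <- function(dataset_id, variables, constraints, ext = "nc") {
  # Sort constraints by name so the same query always maps to the same file
  constraints <- constraints[order(names(constraints))]
  parts <- c(
    dataset_id,
    paste(sort(variables), collapse = "-"),
    vapply(names(constraints), function(nm) {
      paste(nm, paste(constraints[[nm]], collapse = "_"), sep = "_")
    }, character(1))
  )
  key <- paste(parts, collapse = "__")
  # Replace characters that are awkward in file names (e.g. ":" in timestamps)
  key <- gsub("[^A-Za-z0-9_.-]", "-", key)
  paste0(key, ".", ext)
}

# Hypothetical wrapper: call `fetch(path)` only when the file is not on disk yet.
fetch_if_missing <- function(path, fetch) {
  if (!file.exists(path)) {
    fetch(path)
  }
  invisible(path)
}

# Usage with a dummy fetch function that records how often it is called
constraints <- list(
  time = c("2001-01-01T00:00:00Z", "2010-01-01T00:00:00Z"),
  latitude = c(0, 89.975),
  longitude = c(-179.975, 179.975)
)
fname <- query_filename("tas_baseline_2000_2020_depthsurf", "tas_mean", constraints)
path <- file.path(tempdir(), fname)
unlink(path)  # start from a clean state

n_calls <- 0
dummy_fetch <- function(p) {
  n_calls <<- n_calls + 1
  writeLines("placeholder data", p)
}

fetch_if_missing(path, dummy_fetch)  # file missing: dummy_fetch runs
fetch_if_missing(path, dummy_fetch)  # file present: dummy_fetch is skipped
```

Sorting the constraints and variables before building the name means the same query yields the same file regardless of argument order.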