bio-oracle / biooracler

R package to access Bio-Oracle data via ERDDAP

choose output file name in download_layers() #13

Open AMBarbosa opened 4 months ago

AMBarbosa commented 4 months ago

download_layers() produces cryptic filenames, and (as far as I can see) the layers inside don't carry information on what SSP they refer to. I think it would be very useful if we could choose, not only the directory, but also the file name for the output layers -- e.g., something like filename = "chl_2040_ssp585"

yuliaUU commented 3 months ago

I wrote some code that renames the file to the dataset ID:

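library(glue)

# Download the layer into `dir`
a <- biooracler::download_layers(dataset_id,
                                 variables,
                                 constraints,
                                 fmt = "nc",
                                 directory = dir)
# terra::sources(a) gives the path of the file behind the layers; keep its name
filename_with_ext <- basename(terra::sources(a))[1]
# Rename it to <dataset_id>_<variables>.nc (assumes a single variable)
file.rename(
  from = glue("{dir}/{filename_with_ext}"),
  to = glue("{dir}/{dataset_id}_{variables}.nc"))

salvafern commented 1 month ago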

Hi @AMBarbosa and @yuliaUU,

Apologies for the late reply, somehow I didn't get a notification!

I definitely understand why this is important. This package is a lightweight wrapper around rerddap, dedicated solely to Bio-Oracle. The functionality, however, is the same as in rerddap. From https://docs.ropensci.org/rerddap/articles/rerddap.html#caching:

When you use griddap() or tabledap() functions, we construct a MD5 hash from the base URL, and any query parameters - this way each query is separately cached. [...]

At the moment, every download performed via biooracler requires at least one constraint. That means that just renaming the file to the dataset_id is not enough: you may have requested any arbitrary constraint, and this is not reflected in the dataset_id. This is why rerddap uses an MD5 hash to name the files: it allows any constraint to be requested while still caching every request.
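
To illustrate why a hash avoids collisions where the dataset_id alone does not, here is a minimal sketch in base R. It is not rerddap's actual implementation: the helpers `md5_string()` and `cache_key()`, the way the constraints are serialised, and the dataset ID are all assumptions made for this example.

```r
# Hypothetical helper: MD5 of a string using only base R.
# tools::md5sum() hashes files, so the string is written to a temp file first.
md5_string <- function(x) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(charToRaw(x), tmp)
  unname(tools::md5sum(tmp))
}

# Hypothetical cache key: the dataset ID plus a deterministic rendering
# of every constraint, hashed into one fixed-length string.
cache_key <- function(dataset_id, constraints) {
  parts <- vapply(names(constraints), function(n) {
    paste0(n, "=", paste(constraints[[n]], collapse = ","))
  }, character(1))
  md5_string(paste(c(dataset_id, parts), collapse = "&"))
}

dataset_id <- "example_dataset"  # placeholder, not a real Bio-Oracle ID
north <- list(latitude = c(10, 20), longitude = c(120, 130))
south <- list(latitude = c(-40, -30), longitude = c(120, 130))

key_north <- cache_key(dataset_id, north)
key_south <- cache_key(dataset_id, south)

# Same dataset_id, different constraints: the hashed names differ,
# whereas "<dataset_id>.nc" would be identical for both requests.
key_north != key_south
```

Repeating a request with the same constraints yields the same key, which is what lets a cache recognise a file it has already downloaded.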

My preferred solution would be to modify download_layers() to allow downloading the full NetCDF file, named after the dataset_id. However, if you added any constraints, the layer would be downloaded as in rerddap, using an MD5 hash. Would this work for you?
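
The naming rule proposed here could look roughly like the sketch below. `layer_filename()` is a hypothetical function, not part of biooracler or rerddap, and hashing the deparsed constraints is only one assumed way of deriving a stable name.

```r
# Hypothetical naming rule: a readable name for a full download,
# a hash-based name as soon as any constraint is supplied.
layer_filename <- function(dataset_id, constraints = NULL, ext = "nc") {
  if (length(constraints) == 0) {
    # No constraints: the whole dataset is requested, so the ID identifies it.
    return(paste0(dataset_id, ".", ext))
  }
  # Constraints present: hash the ID together with the constraints
  # (assumed serialisation via deparse()) so each request gets its own file.
  key <- paste(c(dataset_id, deparse(constraints)), collapse = "")
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(charToRaw(key), tmp)
  paste0(unname(tools::md5sum(tmp)), ".", ext)
}

full_name   <- layer_filename("example_dataset")
subset_name <- layer_filename("example_dataset",
                              list(latitude = c(10, 20)))
```

With this rule `full_name` is "example_dataset.nc", while `subset_name` is a 32-character hash followed by ".nc".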

yuliaUU commented 2 weeks ago

Yes, that solution will work for me.