r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0

bowerbird package to build example data sources #7

Closed (mdsumner closed this issue 7 years ago)

mdsumner commented 7 years ago

This is a heads-up about one tool for building example data sets. We can use it to build collections from a variety of file sources for intensive testing. I am happy to help anyone with this. This issue is just a general resource; it can be closed, but having a general place to put links and resources would be useful (perhaps the Wiki tab here?)

bowerbird is our general tool for getting data, including rasters, remote sensing, model output in NetCDF, GRIB, HDF, and vector data (any format actually, though most in practice are "gridded"). It can be used to build test file sets, and it will keep the collection up to date with efficient source/target comparison.

https://github.com/AustralianAntarcticDivision/bowerbird

One very simple example is the daily Optimum Interpolation Sea Surface Temperature (OISST) data (1981 to now) on a regular 0.25 degree grid; one file is used in this early example:

https://github.com/r-spatial/stars/issues/5

mdsumner commented 6 years ago

I'm not sure why I closed this, but now that stars:backend is live it's a good time to revisit.

Bowerbird is now very mature. It is a tool for building collections of files from online sources; it is completely data-type and data-meaning agnostic, essentially an R wrapper around wget for "getting files" and keeping them current. Bowerbird fits in the space between "sending code to run remotely" and "obtaining data in real-time via online APIs": https://australianantarcticdivision.github.io/bowerbird/articles/bowerbird.html#overview

It does include some Sentinel data in the default configuration: these "Copernicus" altimeter data were recently updated and republished (previously we obtained them from https://www.aviso.altimetry.fr). It should be trivial to use this entry as a template for other Sentinel data, and I can be tasked to set that up.

https://australianantarcticdivision.github.io/bowerbird/articles/bowerbird.html#cmems-global-gridded-ssh-reprocessed-1993-ongoing

We augment bowerbird's default configuration with a far larger set of data in blueant, so it should be pretty easy to apply these systems to building up a collection for stars.

edzer commented 6 years ago

Thanks! Does it deal well with object storage, say, S3 on AWS? I found out that NetCDF files are pretty much impossible to read from an S3 bucket.

mdsumner commented 6 years ago

I don't know; it's something I need to explore. We did discuss it a little here: https://twitter.com/snarkyboojum/status/942289336631832577?ref_src=twcamp%5Ecopy%7Ctwsrc%5Eandroid%7Ctwgr%5Ecopy%7Ctwcon%5E7090%7Ctwterm%5E1