r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0
563 stars 94 forks

stars vs terra? #633

Open dmkaplan2000 opened 1 year ago

dmkaplan2000 commented 1 year ago

Hi,

I teach a small class on GIS in R and I have been struggling with how to advise my students on whether to use stars or terra when working with raster data. I like stars as it follows a logic similar to that of sf, but I have noticed that the raster package itself appears to be pushing future users towards terra as a replacement of raster (and to a lesser degree for sp instead of sf). I imagine you are not neutral on this question, but I was wondering if you had any thoughts about what the future evolution of these two packages will be. Are they both expected to continue parallel independent development? Are they expected to have different niches or simply be alternative ways of doing similar things? Which of the two tools is best for students to learn?

Thanks, David

edzer commented 1 year ago

Thanks for your question, I understand your concern: it would have been easier if there had been one package instead of two, or if the two packages had no overlap.

The goal of stars has always been to cover data cubes, including raster data cubes of dimension higher than three (e.g. time series of multispectral remote sensing images) and vector data cubes: array data where one or more dimensions are associated with vector geometries, such as time series for in situ sensors or time-varying O-D matrices. I think terra does not cover those. Because the "raster layer" and the "raster stack" are special cases of the raster data cube, they're included in stars.
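
To make the "dimension higher than three" case concrete, here is a minimal sketch of a four-dimensional raster cube built from a plain R array. The array contents, the dimension names `band` and `time`, and the dates are invented for illustration; real data would normally come from a file rather than `runif()`.

```r
library(stars)

# Hypothetical data: a 4 x 3 pixel grid, 2 spectral bands, 5 time steps.
# The named dim attribute tells st_as_stars() what to call each dimension.
a <- array(runif(4 * 3 * 2 * 5),
           dim = c(x = 4, y = 3, band = 2, time = 5))

# Turn the array into a stars object (a raster data cube).
cube <- st_as_stars(a)

# Attach (made-up) dates to the time dimension.
dates <- as.Date("2020-01-01") + 0:4
cube <- st_set_dimensions(cube, "time", values = dates)

# Inspect the dimension table and the cube's shape.
print(st_dimensions(cube))
print(dim(cube))
```

A single raster layer is the same kind of object with only the `x` and `y` dimensions, which is the sense in which layers and stacks are special cases of the cube.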

Whereas stars is written entirely in R with S3 classes (apart from a limited amount of C++ code, hosted in sf, to interface with GDAL), terra is written mostly in C++ with S4 classes. This gives terra a performance benefit, but also makes the code harder to find and the data structures more opaque and harder to understand. If you want fast code that "just works", maybe use terra; if you want to teach with a package that creates transparent R objects, and/or you want the students to be able to see what happens mostly in R or to easily modify the package, maybe use stars. stars also supports affine, rotated, rectilinear and curvilinear rasters, see e.g. #632.
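
As a rough illustration of what "transparent R objects" means in practice, the sketch below takes a small stars object apart using only base R tools. The matrix is invented example data, and this covers only the stars side; it is not a comparison with terra's internals.

```r
library(stars)

# Hypothetical data: a 5 x 4 matrix with named dimensions.
m <- matrix(1:20, nrow = 5, ncol = 4)
dim(m) <- c(x = 5, y = 4)
s <- st_as_stars(m)

# A stars object is an ordinary S3 object: a list of arrays
# plus a "dimensions" attribute describing the axes.
print(class(s))
print(is.list(s))
print(is.array(s[[1]]))
print(names(attributes(s)))

# The dimension metadata is itself a plain R object that can be inspected.
str(unclass(st_dimensions(s))[["x"]])
```

Because everything is reachable with `class()`, `attributes()` and `str()`, students can look inside the object with the same tools they use for data frames and lists.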

One of my goals in spatial data science is to make large data cube archives, such as those from ERA5, CMIP6 and the various Copernicus or Landsat missions, more easily available to a wider group of interdisciplinary data scientists. Besides stars, we have initiated or contributed to openEO, STAC and gdalcubes. Part of this work aims at bridging the gap between the modelling community that uses NetCDF, HDF5 or Zarr, and the geospatial GIS community that uses GeoTIFF and GPKG; some of that effort went into stars using GDAL's multidimensional array interface to read and write Zarr files, and another activity is co-developing s2.

stars' future evolution will follow these goals; for terra's evolution we'd have to ask @rhijmans. Although Robert and I don't meet often, I believe that we take a lot of inspiration from each other's work.

dmkaplan2000 commented 1 year ago

Thanks for this thoughtful and very helpful response. These different focuses make a lot of sense. As I said, I personally like that stars builds off of clearly identifiable and logically constructed primitives found in sf and base R - it just makes sense to me in a way that the raster package and its children never did. So I am happy to keep using it and recommending it, I just wanted to make sure I wasn't hitching my horse to the wrong carriage.

Cheers, David