Discussion - Githubissues

rafaqz commented 2 years ago

This package defines a simple Extent object that gives you the bounds for each dimension covered by any spatial object - like the edges of a raster, or the min/max values for each dimension in a vector or table of points.

@evetion, this should work for the GeoInterface extent object. Although I'm wondering where we should put it...

A question I have with the design is whether we can always assume we know the dimension names, and how often/where we don't know them. Currently the Extent object can hold a Tuple or a NamedTuple, to handle both cases. But I'm not sure how necessary that is.

@Tokazama, if you have any input on this, it would be useful at this early stage - especially to tie in with SpatioTemporalTraits.jl. The idea here is that all spatial objects have extents / bounding boxes, and we can have a generic Extent object that defines these in a standard way, returned by extent(obj). As much as possible, an Extent will also have dimension names, as Symbols. Comparing named extents compares by name, ignoring order.
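To make the named comparison concrete, here is a minimal Julia sketch. `SketchExtent` and `sketch_extent` are hypothetical names made up for illustration and are not the package's API; the sketch assumes bounds are stored as `(min, max)` tuples in a NamedTuple and that points are NamedTuples with `X` and `Y` fields.

```julia
# Hypothetical sketch, not the Extents.jl implementation.
# Bounds are stored per dimension name as (min, max) tuples.
struct SketchExtent{K,V}
    bounds::NamedTuple{K,V}
end

# Keyword constructor: SketchExtent(X=(0.0, 1.0), Y=(2.0, 3.0))
SketchExtent(; kw...) = SketchExtent(values(kw))

# Equality compares bounds by dimension name and ignores the order
# in which the dimensions were given.
function Base.:(==)(a::SketchExtent{K1}, b::SketchExtent{K2}) where {K1,K2}
    length(K1) == length(K2) || return false
    return all(k -> k in K2 && a.bounds[k] == b.bounds[k], K1)
end

# Hypothetical stand-in for extent(obj), for a vector of points that are
# NamedTuples with X and Y fields (an assumption of this sketch).
function sketch_extent(points)
    xs = map(p -> p.X, points)
    ys = map(p -> p.Y, points)
    return SketchExtent(X=extrema(xs), Y=extrema(ys))
end

a = SketchExtent(X=(0.0, 10.0), Y=(-5.0, 5.0))
b = SketchExtent(Y=(-5.0, 5.0), X=(0.0, 10.0))
a == b  # true: same names and bounds, different order
```

With unnamed Tuple bounds the same comparison would have to fall back to positional order, which is where the two cases diverge.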

I guess extent will provide your spatial_order, spatial_first and spatial_last, although it doesn't distinguish the space/time difference.

Primarily this is for GeoInterface.jl and Rasters.jl and other GeoSpatial packages to be able to share extent information in a generic way. I will also use it in DimensionalData.jl.

Tokazama commented 2 years ago

I'm hoping we can separate the spatial vs temporal trait from other individual traits like this, so that they are more flexible, but I'll have to think about this some more.

evetion commented 2 years ago

I would assume we always have dimension names or packages could generate symbols on the fly.

Mixing both named and unnamed seems to complicate things, as the comparison order changes between them. We could split it off into an UnnamedExtent?

Tokazama commented 2 years ago

There was a discussion a while back in NamedDims and I was kind of pushing for NamedTuple to be used for more of the interface. There are two problems with this though:

We can't have repeating names, so if we have a default name value it doesn't work as a NamedTuple. One could argue that is a sign of an incorrect implementation, but I'd rather not get caught up in that.
We also want to support dynamically named dimensions sometimes. You could probably argue that you wouldn't need this package unless you knew the names statically, but, again, I don't think it's necessarily worth the debate when a tuple would do.

Just my two cents.

rafaqz commented 2 years ago

@Tokazama thanks for the comments. Tuple probably won't be enough for geospatial data because the dimension order is not consistent, but the dimension names are. And we want interop across different point and dimension orders - even equality of Extents is defined to ignore order because it's the area we care about most.

Maybe we can get around the repeated names problem by using a Tuple and type parameter Symbols directly without using the NamedTuple? DimensionalData/Rasters.jl can also have repeated dimension names after some statistical operations (and everything still works), so that's a good point.
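A rough Julia sketch of what that could look like, under stated assumptions: `ParamExtent`, `dimnames` and `bounds_of` are hypothetical names for illustration only, and returning the first match for a repeated name is an arbitrary choice made here, not a decision from this thread.

```julia
# Hypothetical sketch: dimension names live in the type parameter K as a
# tuple of Symbols, and the bounds are a plain Tuple in the same order.
struct ParamExtent{K,V<:Tuple}
    bounds::V
end

# Convenience constructor so only the names need to be written out.
ParamExtent{K}(bounds::V) where {K,V<:Tuple} = ParamExtent{K,V}(bounds)

# The names are recovered from the type, so they are known at compile time.
dimnames(::ParamExtent{K}) where {K} = K

# Look up bounds by name. With repeated names this returns the first match
# (an arbitrary choice for this sketch).
function bounds_of(ext::ParamExtent{K}, name::Symbol) where {K}
    i = findfirst(==(name), K)
    i === nothing && throw(KeyError(name))
    return ext.bounds[i]
end

e = ParamExtent{(:X, :Y)}(((0.0, 10.0), (-5.0, 5.0)))
bounds_of(e, :Y)  # (-5.0, 5.0)

# Repeated names are accepted here, unlike in a NamedTuple.
dup = ParamExtent{(:X, :X)}(((0.0, 1.0), (2.0, 3.0)))
dimnames(dup)     # (:X, :X)
```

The trade-off is that name lookup and any NamedTuple-style conveniences have to be written by hand.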

For dynamic name changes, in what circumstances do the names of spatial dimensions need to change but you can't rebuild a new object with new names?

Having all of this work based on named comparisons, but also compile away completely, is very nice for e.g. writing fast rasterization algorithms with arbitrary source and target dimension order.

@evetion it is easier if we assume/force dimension names. Maybe we can start with that and see if it breaks anywhere.

Tokazama commented 2 years ago

Are the dimension names used purely to determine the order?

rafaqz commented 2 years ago

Not purely order. They also tell us the dimensions are the same kind, which helps catch errors like treating time and spatial Z as the same thing.

You could also use it for e.g. masking or subsetting a 3d object with a 2d object, but only on the shared dimensions.
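As a minimal sketch of the shared-dimensions idea in Julia: `shared_intersection` is a hypothetical helper written for illustration, not an existing function, and it assumes bounds are plain NamedTuples of `(min, max)` tuples.

```julia
# Hypothetical helper: intersect two sets of named bounds, but only on the
# dimension names both of them have. Returns nothing if the bounds do not
# overlap on some shared dimension.
function shared_intersection(a::NamedTuple{K1}, b::NamedTuple{K2}) where {K1,K2}
    # Keep the names of `a` that also appear in `b`, in the order of `a`.
    shared = Tuple(k for k in K1 if k in K2)
    vals = map(shared) do k
        lo = max(a[k][1], b[k][1])
        hi = min(a[k][2], b[k][2])
        lo <= hi ? (lo, hi) : nothing
    end
    any(v -> v === nothing, vals) && return nothing
    return NamedTuple{shared}(vals)
end

cube = (X=(0.0, 10.0), Y=(0.0, 10.0), Z=(0.0, 100.0))  # 3d bounds
mask = (Y=(5.0, 20.0), X=(-5.0, 5.0))                   # 2d bounds, other order
shared_intersection(cube, mask)  # (X = (0.0, 5.0), Y = (5.0, 10.0))
```

Because matching is by name, a time dimension and a Z dimension are never compared against each other, even if they sit at the same position.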

rafaqz / Extents.jl

Discussion #1