radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

fix(examples): change epsg summary example to an array of all values #1162

Closed dwsilk closed 2 years ago

dwsilk commented 2 years ago

Related Issue(s): #

N/A

Proposed Changes:

proj:epsg should be summarised as an array of all values, not a Range Object, per stac-fields and other examples.


m-mohr commented 2 years ago

stac-fields is not normative, just a recommendation. It is totally valid to use a Range Object. I agree, though, that it's weird to use one for a single value. PRs must be made against dev instead of master. I changed the base of the PR, but now it includes more commits than it probably should.

emmanuelmathot commented 2 years ago

My bad, the collection examples were generated by the items aggregator function of DotNetStac. I fixed it to use a single value when min/max are the same.

dwsilk commented 2 years ago

> stac-fields is not normative, just a recommendation. It is totally valid to use a Range Object. I agree, though, that it's weird to use one for a single value.

As a real-world example: we have a 1:50,000 topographic map series of georeferenced rasters that covers New Zealand, the Chatham Islands and Niue. It is the same map series showing the same features and cartography, just for different islands in the Pacific that use different projections. The EPSG codes involved are 2193, 3793 and 32702.

So

"proj:epsg": {
    "minimum": 2193,
    "maximum": 32702
},

makes little sense as a summary because it implies that our Collection contains epsg codes spanning a huge range, whereas

"proj:epsg": [
    2193,
    3793,
    32702
],

summarises precisely which epsg codes are in-use within the Collection and seems much more useful.
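To make the difference concrete, here is a minimal sketch of both summary styles computed from the same per-Item values. The helper names `summarize_as_list` and `summarize_as_range` and the `item_epsg_codes` list are hypothetical, made up for illustration; they are not part of any STAC library.

```python
import json


def summarize_as_list(values):
    """Array-style summary: the sorted distinct values that occur."""
    return sorted(set(values))


def summarize_as_range(values):
    """Range Object summary: only the smallest and largest value."""
    return {"minimum": min(values), "maximum": max(values)}


# Hypothetical proj:epsg values, one entry per Item in the Collection.
item_epsg_codes = [2193, 2193, 3793, 32702, 2193]

# The list keeps exactly the codes in use.
print(json.dumps({"proj:epsg": summarize_as_list(item_epsg_codes)}))

# The range keeps only the endpoints and drops 3793 entirely.
print(json.dumps({"proj:epsg": summarize_as_range(item_epsg_codes)}))
```

The range form loses the information that 3793 is present and says nothing about which codes between the endpoints are actually used.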

In general, we have been struggling a bit with implementing summaries because there are rarely examples or guidance on how to implement them: the projection extension doesn't have a collection.json example showing this, there is no guidance in the README, and then here in the STAC specification repo there are 2 different examples with 2 different implementations for summarising the same field.

Should we be working towards consistency on how summaries are implemented for specific fields or is it really just intended that as long as we pick 1 of the 3 specified methods of summarising, that's fine?

> PRs must be made against dev instead of master. I changed the base of the PR, but now it includes more commits than it probably should.

Oops, I thought I did that, sorry. I have now reset my branch onto dev and force pushed to remove the additional commits.

m-mohr commented 2 years ago

I did not mean to say you need to encode everything in Range Objects. The structure you choose needs to fit the data you are summarizing. In your example a list makes sense, of course. But if you have a summary of all UTM zones, it could be encoded as a Range, as it contains all codes from 32601 to 32667 (or so; I haven't closely checked the exact range).

> and then here in the STAC specification repo there are 2 different examples with 2 different implementations for summarising the same field.

Yes, because that's totally fine. Summaries can use the structure that suits the underlying data best.

> is it really just intended that as long as we pick 1 of the 3 specified methods of summarising, that's fine?

Yes, that's the case. Sure, there are (community) recommendations (e.g. in the STAC Fields package), but in principle you can do whatever you want as long as it complies with the spec.
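One way to read "use the structure that suits the data" is sketched below. This is a heuristic of my own for illustration, not something the spec prescribes: `choose_summary` is a hypothetical helper that emits a Range Object only when the distinct integer codes form one unbroken run, and a list otherwise (including the single-value case). The UTM example assumes the WGS 84 / UTM northern zones, EPSG:32601 through 32660.

```python
def choose_summary(values):
    """Pick a summary structure for integer codes (illustrative heuristic).

    Returns a Range Object when the distinct values form one unbroken run
    of at least two codes, otherwise the sorted list of distinct values.
    """
    distinct = sorted(set(values))
    if not distinct:
        raise ValueError("cannot summarise an empty set of values")
    # A run is unbroken when its span equals the number of distinct values.
    is_contiguous = distinct[-1] - distinct[0] + 1 == len(distinct)
    if len(distinct) > 1 and is_contiguous:
        return {"minimum": distinct[0], "maximum": distinct[-1]}
    return distinct


# All WGS 84 / UTM northern zones: one unbroken run, so a range fits.
utm_north = choose_summary(range(32601, 32661))

# Scattered codes: a list is the honest summary.
nz_series = choose_summary([2193, 3793, 32702])

# A single distinct value stays a one-element list rather than a range.
single = choose_summary([2193, 2193])
```

Either output is valid under the spec; the heuristic only avoids a range that would imply codes that are not actually present.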

dwsilk commented 2 years ago

> But if you have a summary of all UTM zones, it could be encoded as a Range

This works for the full set of UTM zones, but it doesn't work for every subset of contiguous UTM zones: if a Collection's extent were the Pacific Ocean, you'd have 2 separate ranges, so you would have to use an array of values.
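A small sketch of that case, under assumptions: the zones picked here (WGS 84 / UTM 58N to 60N and 1N to 3N, which meet at the antimeridian) are a hypothetical selection for illustration, and `contiguous_runs` is a made-up helper, not an existing API.

```python
def contiguous_runs(values):
    """Group integer codes into (start, end) runs of consecutive values."""
    runs = []
    for code in sorted(set(values)):
        if runs and code == runs[-1][1] + 1:
            # Extend the current run by one code.
            runs[-1] = (runs[-1][0], code)
        else:
            # Gap found (or first code): start a new run.
            runs.append((code, code))
    return runs


# Hypothetical Collection straddling the antimeridian:
# zones 58N-60N west of 180 degrees and zones 1N-3N east of it.
pacific_codes = [32658, 32659, 32660, 32601, 32602, 32603]

runs = contiguous_runs(pacific_codes)
print(runs)
```

The zones are geographically adjacent, yet their codes fall into two runs. A single Range Object from 32601 to 32660 would imply 54 zones that are not in the Collection, so the array is the only compact form that stays accurate.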