radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

Non Polygon geometry types #44

Closed mikeskaug closed 6 years ago

mikeskaug commented 6 years ago

Great work on this! It came along at just the right time.

We just started working towards implementing a static catalogue using the STAC spec. I should start off by saying that we have non-imagery data, which I realize is not the first priority for the STAC spec. The only part of the core STAC spec which causes us problems is the requirement that the geometry be of either Polygon or MultiPolygon type.

We have data collected as point measurements, so it would make the most sense to use a GeoJSON Point as the geometry type for each Item. We could define a polygonal envelope around each point to satisfy the requirement that the geometry be a Polygon or MultiPolygon, but it would be somewhat ambiguous how we define the polygon.
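To make the ambiguity concrete, here is a minimal Python sketch, not taken from the spec. The field layout (`id`, `properties.datetime`, `assets`) only loosely follows the Item structure, and the helper names, the asset key `data`, and the example URL are all hypothetical. It builds an Item-like Feature with a Point geometry, then shows that a polygonal envelope needs an arbitrary size parameter.

```python
import json


def point_item(item_id, lon, lat, datetime_str, asset_href):
    """Build an Item-like GeoJSON Feature whose geometry is a Point.

    The field layout is an assumption for illustration only.
    """
    return {
        "type": "Feature",
        "id": item_id,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"datetime": datetime_str},
        "assets": {"data": {"href": asset_href}},
    }


def envelope(lon, lat, half_width_deg):
    """Square Polygon around a point.

    half_width_deg is an arbitrary choice: nothing in the data says
    how large the envelope should be, which is the ambiguity.
    """
    west, east = lon - half_width_deg, lon + half_width_deg
    south, north = lat - half_width_deg, lat + half_width_deg
    ring = [[west, south], [east, south], [east, north],
            [west, north], [west, south]]  # closed ring: first == last
    return {"type": "Polygon", "coordinates": [ring]}


item = point_item("sample-001", 10.0, 50.0, "2018-03-01T00:00:00Z",
                  "https://example.com/sample-001.bin")
print(json.dumps(item, indent=2))

# Two equally defensible envelopes for the same measurement:
small = envelope(10.0, 50.0, 0.5)
large = envelope(10.0, 50.0, 1.0)
```

Both envelopes are valid Polygons, but neither is more correct than the other, whereas the Point describes the measurement location exactly.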

joshfix commented 6 years ago

Definitely would be good to include more geometries. Do you have any bandwidth to add them to the Swagger doc and submit a PR?

cholmes commented 6 years ago

Thanks Mike!

Can you describe the data a bit more? We were definitely imagining that all the 'assets' in a catalog would represent more than a single point, and that they really point to some other file that people download, which is the thing they want. So I'm curious what that 'thing' is in your case - the asset that is represented by the point. And is there a thumbnail? Something to browse? That was another requirement.

If everything of interest at the point can be represented as fields, instead of another file download, then I'm not sure STAC is appropriate, for example if it's just an observation that can be described in a normal GeoJSON / Shapefile. But I'd be super into exploring a 'spatiotemporal feature' static thing with you - represent your data online as linked GeoJSON to be crawled. And then STAC would just be an extension of the spatiotemporal feature that includes assets & a thumbnail.

mikeskaug commented 6 years ago

The data exists in several different formats at different points in the processing, but each one is a multidimensional array associated with a single geographic point. So the 'asset' in our case is some big binary file storing the n-dimensional array. To put it in the imaging context, imagine what you would get with a hyperspectral camera with one pixel. Eventually we reach a gridded data product that can be saved as GeoTIFF and would fit the STAC spec perfectly, but we have several levels of data before that to organize, search and manage.

I guess the thumbnail is also an issue. For the specific data I'm working on now, it's natural to represent it in an image format, although the width/height dimensions are not spatial, but in general it might be difficult to think of a thumbnail representation.

cholmes commented 6 years ago

Very cool. I think it makes sense to fit that into STAC, and I really like the idea of representing the full derivation from the gridded data product back to the multidimensional array. Though I think that could happen even if the 'point thing' with assets wasn't technically STAC, but used the same constructs.

For thumbnails, the main use case is just for a user browsing them in a catalog GUI. Like I don't see people trying to place thumbnails directly on the map (and if they did they'd need a polygon anyways). You'd just get results and then scroll through, with some image instead of just a stream of text. So if it's natural to represent as an image then it could be ok.

I must admit I'm a bit torn on changing the core STAC to allow points. My big fear is that people will start to use STAC to just put in all kinds of vector data that doesn't reference an asset, and indeed that the core becomes so flexible that validation is almost meaningless. Maybe I should just get over my fears. And indeed the thing to really do is prioritize 'static feature collections', and have STAC be an instance of those.

Curious for others to weigh in on adding 'points' to the core spec. Perhaps we could allow extensions of STAC to override the geometry requirements, so the core is polygons, but extensions are allowed to use other geometry types. I'm not sure how to do that well though. Perhaps a 'type' could indicate geometries needed.
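One way to picture the 'type' idea is the rough Python sketch below. Nothing like it exists in the spec: the `profile` field, the `point-measurement` name, and the registry are all hypothetical. The core check accepts only Polygon and MultiPolygon, and a declared extension can swap in its own set of allowed geometry types.

```python
# Geometry types the core requires, per the discussion above.
CORE_GEOMETRIES = frozenset({"Polygon", "MultiPolygon"})

# Hypothetical registry: extension name -> geometry types it allows instead.
EXTENSION_GEOMETRIES = {
    "point-measurement": frozenset({"Point"}),
}


def allowed_geometries(item):
    """Return the geometry types allowed for this item.

    The 'profile' field is a made-up stand-in for the 'type' idea;
    items without a known profile fall back to the core rule.
    """
    return EXTENSION_GEOMETRIES.get(item.get("profile"), CORE_GEOMETRIES)


def geometry_is_allowed(item):
    """Check the item's geometry type against the applicable rule."""
    geometry = item.get("geometry") or {}  # tolerate a missing/null geometry
    return geometry.get("type") in allowed_geometries(item)


core_item = {"geometry": {"type": "Polygon", "coordinates": []}}
bare_point = {"geometry": {"type": "Point", "coordinates": [10.0, 50.0]}}
profiled_point = dict(bare_point, profile="point-measurement")

print(geometry_is_allowed(core_item))       # core rule applies
print(geometry_is_allowed(bare_point))      # no profile, so core rule rejects it
print(geometry_is_allowed(profiled_point))  # extension rule applies
```

With this shape a validator can stay strict for core items, and the looser rule only applies to items that explicitly opt in.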

Regardless it'd be great for you to make a static catalog with your data, and I'd say go ahead and do it with points and try to make a 'profile' of the other fields. If you can share I would be curious to see more of the actual data you're representing and what it's used for, just to get a better picture in my head.

mikeskaug commented 6 years ago

Thanks for the feedback! I think I see why you're worried about opening up the spec to other geometries. Because of the GeoJSON format, it's at risk of being abused to the point that it stores actual data and not just metadata. In any case, we need some way to start indexing our data and I really like the STAC idea, so we'll continue along this path and try to conform as much as possible to the spec. What did you mean by make a 'profile' of the other fields?

cholmes commented 6 years ago

Great, yeah, definitely stay in line with STAC and then it'd be great to publish what you've done - implementations will definitely drive future specs.

Sorry for the lack of context on a 'profile'. See this roadmap section for a bit more info; you can also scroll down to the next section and to 'additional profiles'. Unfortunately we don't yet have the actual 'mechanism' to have an extended profile, so I don't have much guidance on how to exactly create a profile (I may try to take a crack at one soon).

But basically, a profile would be a set of fields that extend STAC, either vendor specific or for a community. The idea is to then be able to validate those fields and attach meaning to them. My hope is we start with vendor fields, and then evolve to community profiles. So like if you're doing 'ocean surface data' that has a number of special fields then you'd make a profile for it, and then other vendors could share the same fields and meaning.

So practically all I'm saying at this point is to define a JSON Schema for your data, and share that with the community. In the static catalog 'recommendations' I put another little idea of using 'prefixes' to differentiate between different profiles, but it may be overkill.
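As a rough illustration of what such a schema could look like, here is a Python sketch with the JSON Schema written as a dict. The `pm:` prefix, the field names, and the helper function are all hypothetical, and the hand-rolled check only looks for required keys. A real JSON Schema validator would do much more.

```python
import json

# Hypothetical profile: 'pm:' stands in for a vendor or community prefix.
PROFILE_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Hypothetical point-measurement profile",
    "type": "object",
    "required": ["properties"],
    "properties": {
        "properties": {
            "type": "object",
            "required": ["datetime", "pm:instrument", "pm:dimensions"],
            "properties": {
                "datetime": {"type": "string"},
                "pm:instrument": {"type": "string"},
                "pm:dimensions": {
                    "type": "array",
                    "items": {"type": "string"},
                },
            },
        }
    },
}


def missing_profile_fields(item):
    """List required profile fields absent from item['properties'].

    This checks key presence only; it is not a schema validator.
    """
    required = PROFILE_SCHEMA["properties"]["properties"]["required"]
    props = item.get("properties", {})
    return [name for name in required if name not in props]


# The schema itself is plain JSON and can be shared as a file.
print(json.dumps(PROFILE_SCHEMA, indent=2))

incomplete = {"properties": {"datetime": "2018-03-01T00:00:00Z"}}
print(missing_profile_fields(incomplete))
```

Because the prefixed names live in the schema, another vendor could adopt the same file and get the same field names and meaning.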

mikeskaug commented 6 years ago

Ah right. I remember seeing the eo: example before. Thanks.

cholmes commented 6 years ago

Closing this, we're tracking it in #193