radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

Datasets without time #1268

Closed by renaudjester 3 weeks ago

renaudjester commented 5 months ago

Dear all,

We have some datasets that do not have time values, and we are struggling with how to fill in their metadata: the STAC specification requires either a value (properties/datetime) or a range (properties/start_datetime, properties/end_datetime), and pystac therefore has the same requirements. We saw that this was discussed in #792, but in our case we cannot find a proper “nominal” datetime for our datasets.
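The requirement described above can be sketched as a small check. This is a minimal illustration in plain Python, not pystac's actual validation logic: the function name `has_temporal_reference` is hypothetical, and it only tests whether the fields are present, not whether they are well-formed timestamps.

```python
def has_temporal_reference(properties: dict) -> bool:
    """Return True if the properties carry a single datetime or a full range."""
    # Case 1: a single nominal datetime is given.
    if properties.get("datetime") is not None:
        return True
    # Case 2: no nominal datetime, so both ends of the range must be given.
    return (
        properties.get("start_datetime") is not None
        and properties.get("end_datetime") is not None
    )


# A static dataset with no time information fails the check.
bathymetry = {"title": "Bathymetry"}
print(has_temporal_reference(bathymetry))  # False
```

A dataset with only `start_datetime` set would also fail this check, which is the open-ended case raised later in the thread.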

For example, some datasets are static datasets (bathymetry, coordinates) that are atemporal and that are released together with other temporal datasets in our catalog. Other cases might be aggregated statistics and indicators, which have lost the time dimension.

Would it be possible to relax the standard to not make the time properties mandatory? Otherwise, how would you suggest that we describe these datasets?

Many thanks for your help!

m-mohr commented 5 months ago

Well, this is the SpatioTemporal Asset Catalog. As such, it requires certain things, such as space and time references, to be available. I don't think this will be weakened. As an alternative, OGC API - Records is evolving and has less strict requirements. Once it is standardized, I imagine it could be mixed with STAC so that non-SpatioTemporal entities can be described in a compatible format through OGC API - Records.

If you aggregate statistics over a certain time range, shouldn't the statistics have a reference to the source time range? Generally, I think a lot of data actually have a temporal reference, even if it is not completely obvious. But that's just my personal view here.

renaudjester commented 4 months ago

Thanks for your answer!

In our case, users can also search for products in our database based on those dates. For some datasets, a "nominal" date therefore doesn't really make sense as a reference, since the data are aggregated from other sources.

fmigneault commented 3 months ago

What about situations where the end_datetime is not known? For example, a growing item definition that gets more and more labels applied gradually as a continuous annotation effort?

Another case I have involves ML models. Similar to what https://github.com/stac-extensions/ml-model?tab=readme-ov-file#spatiotemporal-fields describes, a max value of "9999-12-31T23:59:59Z" needs to be filled in to work around the requirement, although it would make much more sense to leave the time range open-ended with end_datetime: null, similar to OGC's <start_datetime>/.. nomenclature. The same issue applies to the MLM extension (https://github.com/crim-ca/dlm-extension/pull/2).
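To make the difference concrete, here is a rough sketch of how a far-future placeholder could be mapped to the open-ended `<start_datetime>/..` form. The helper `to_interval` is hypothetical and not part of STAC, pystac, or any OGC library; treating `None` and the placeholder value as "open" is an assumption made for illustration only.

```python
from typing import Optional

# Placeholder value quoted in the comment above; treating it as "open-ended"
# is an assumption of this sketch, not something the specification defines.
FAR_FUTURE = "9999-12-31T23:59:59Z"


def to_interval(start: str, end: Optional[str]) -> str:
    """Format a start/end pair as an interval string, using '..' for an open end."""
    open_ended = end is None or end == FAR_FUTURE
    return f"{start}/{'..' if open_ended else end}"


print(to_interval("2020-01-01T00:00:00Z", FAR_FUTURE))  # 2020-01-01T00:00:00Z/..
```

The drawback of the placeholder is visible here: a consumer has to know which large value means "undefined", whereas an explicit null needs no such convention.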

mikemahoney218 commented 3 months ago

> What about situations where the end_datetime is not known?

It seems to me like the current end_datetime is known, even if the final end_datetime isn't. Is there a reason you can't update your end_datetime when updating other labels/data?

fmigneault commented 3 months ago

Yes. For that case I agree, the end_datetime could be updated at the same time as the labels get pushed. I am more worried about users taking the other "patch" approach of simply setting a high value to work around the schema validation. I would prefer to have a way to specify an explicit representation of "undefined" rather than an arbitrary large value.
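For illustration, updating end_datetime alongside new labels might look like the sketch below. The helper `extend_end_datetime` and the example timestamps are hypothetical; it assumes timestamps are RFC 3339 strings ending in "Z" and works on a plain properties dictionary rather than on any particular STAC library object.

```python
from datetime import datetime


def _parse(value: str) -> datetime:
    # Replace the trailing "Z" so older Python versions can parse the string.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def extend_end_datetime(properties: dict, observed: str) -> dict:
    """Return a copy of properties whose end_datetime covers the new timestamp."""
    updated = dict(properties)
    current = updated.get("end_datetime")
    # Only move the end forward; never shrink an existing range.
    if current is None or _parse(observed) > _parse(current):
        updated["end_datetime"] = observed
    return updated


item_properties = {
    "datetime": None,
    "start_datetime": "2024-01-01T00:00:00Z",
    "end_datetime": "2024-02-01T00:00:00Z",
}
# A new batch of labels arrives in March, so the range is extended.
item_properties = extend_end_datetime(item_properties, "2024-03-15T12:00:00Z")
print(item_properties["end_datetime"])  # 2024-03-15T12:00:00Z
```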

m-mohr commented 3 months ago

By the way, the open range case fits better into issue https://github.com/radiantearth/stac-spec/issues/1161. Generally, I wouldn't mind discussing a PR that allows open date ranges.

m-mohr commented 3 weeks ago

I think everything has been said here?!