stac-extensions / osc

STAC Extension for the ESA Open Science Catalog
Apache License 2.0

Relation with other extensions #1

Closed m-mohr closed 1 year ago

m-mohr commented 1 year ago

A couple of questions and potential to align with other STAC resources:

Instead of `e-mail`, it would fit better to use `email`, as in STAC we don't usually use hyphens in field names.
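For illustration, the rename could be applied to existing metadata with a small helper. This is only a sketch: `rename_fields`, `FIELD_RENAMES` and the sample contact object are made up for this example; only the `e-mail` to `email` rename comes from this thread.

```python
# Only the "e-mail" -> "email" rename comes from the discussion;
# the helper and the sample object are hypothetical.
FIELD_RENAMES = {"e-mail": "email"}


def rename_fields(obj, renames=FIELD_RENAMES):
    """Recursively rename dictionary keys according to `renames`."""
    if isinstance(obj, dict):
        return {
            renames.get(key, key): rename_fields(value, renames)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [rename_fields(item, renames) for item in obj]
    return obj


technical_officer = {"name": "Jane Doe", "e-mail": "jane.doe@example.com"}
renamed = rename_fields(technical_officer)
print(renamed)
```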

constantinius commented 1 year ago

osc:type and osc:status: Maybe choose a consistent casing?

Yes, that was bugging me as well. What is the usual casing for enumerated strings in your opinion?

osc:name: What's the difference to the Collection title? Maybe just use that?

We generate our STAC Collections from tabular data, where this is basically a slightly different value from the title. It is just one piece of metadata that I did not want to lose. I don't think that this is important enough to generalize.

osc:region: I've seen this a couple of times. Could this be a separate extension that gives a human readable name for the given spatial extent/geometry?

I think that this field has some re-use value. Do you have some examples of where this was used before?

osc:themes: How's this different to keywords or the subjects extension?

Good question. First, I don't quite understand the difference between keywords and subjects:keywords (I guess there is none? https://github.com/stac-extensions/subjects/issues/1#issuecomment-1164553888)

Second, the subjects:terms seem to be bound to GeoNames or Wikipedia, neither of which is applicable. I would be up for making this more general in the subjects extension.

I was not using keywords for this, as we already had to use keywords in another context. I think the concept of "themes" is specific enough to have a dedicated field apart from keywords.

osc:variables: If data cubes, maybe cube:variables may be interesting.

Unfortunately, the semantics are different. cube:variables reference variables within a netCDF file, whereas osc:variables refer to physical ones, like the concentration of a trace gas or a temperature or something similar.

osc:technical_officer: Looks similar to "providers", maybe something to align there. But maybe also interesting to get to a general "contacts" extension?

I like the concept of a contacts field similar to providers. I think it would be an abuse to use providers for personal contact information.

osc:missions: What's the difference to using "missions" from common metadata in summaries?

This could work, but I do have a gripe with it. In the spec it says:

Collections are strongly recommended to provide summaries of the values of fields that they can expect from the properties of STAC Items contained in this Collection.

But the items themselves are sometimes using data from different satellite missions to form the scientific product. So the mission property of a STAC Item would be inadequate by itself. Any suggestions on how to proceed?

osc:consortium: Similar to above, is this something for providers or so?

I guess this can be achieved using the providers. I think I will remove it from this extension.

Instead of `e-mail`, it would fit better to use `email`, as in STAC we don't usually use hyphens in field names.

Will do so.

m-mohr commented 1 year ago

Yes, that was bugging me as well. What is the usual casing for enumerated strings in your opinion?

It's not very consistent across the extensions (either all upper-case or all lower-case), but the core spec leans towards all lower-case.
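A minimal sketch of what settling on lower-case could look like when generating the Collections. The field values below are made up, and `normalize_enum_casing` is a hypothetical helper, not part of the extension:

```python
# Fields named in this thread as having inconsistent casing.
ENUM_FIELDS = ("osc:type", "osc:status")


def normalize_enum_casing(collection):
    """Return a shallow copy of the collection with enumerated strings lower-cased."""
    normalized = dict(collection)
    for field in ENUM_FIELDS:
        value = normalized.get(field)
        if isinstance(value, str):
            normalized[field] = value.lower()
    return normalized


# Made-up example values; the real enumerations are defined by the extension.
collection = {"id": "example", "osc:type": "Product", "osc:status": "ONGOING"}
normalized = normalize_enum_casing(collection)
print(normalized["osc:type"], normalized["osc:status"])
```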

osc:region: I've seen this a couple of times. Could this be a separate extension that gives a human readable name for the given spatial extent/geometry?

I think that this field has some re-use value. Do you have some examples of where this was used before?

I can't remember all places, but for example noaa_mrms_qpe:region from https://github.com/stac-extensions/noaa-mrms-qpe seems very similar. I think a geolocation name extension would be interesting. Might be just a single field, but nevertheless.

Good question. First, I don't quite understand the difference between keywords and subjects:keywords (I guess there is none? stac-extensions/subjects#1 (comment))

keywords is just like tags, it's for uncontrolled free-form "vocabularies". The subjects extension is for keywords that belong to controlled vocabularies.

I assume your themes are pre-defined somewhere, so it would be more fitting to use the subjects extension, I think. subjects is still under discussion, also with OGC API Records, but I think it would fit your use case.
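To make the distinction concrete, here is a rough sketch that splits a list of terms into free-form keywords and terms from a controlled vocabulary. Everything in it is an assumption for illustration: the vocabulary, the scheme URL, the `split_terms` helper, and the `themes` layout (a `scheme` plus a list of `concepts`), which is loosely modelled on OGC API - Records and may not match the final subjects extension.

```python
def split_terms(terms, scheme, vocabulary):
    """Separate controlled-vocabulary terms (themes) from free-form keywords."""
    concepts = [{"id": term} for term in terms if term in vocabulary]
    keywords = [term for term in terms if term not in vocabulary]
    result = {"keywords": keywords}
    if concepts:
        # Assumed layout: one theme object per controlled vocabulary (scheme).
        result["themes"] = [{"scheme": scheme, "concepts": concepts}]
    return result


# Hypothetical controlled vocabulary and scheme identifier.
OSC_THEMES = {"atmosphere", "oceans", "land"}
OSC_SCHEME = "https://example.com/osc/themes"

fields = split_terms(["atmosphere", "my-project-tag", "oceans"], OSC_SCHEME, OSC_THEMES)
print(fields)
```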

Second, the subjects:terms seem to be bound to GeoNames or Wikipedia, neither of which is applicable

I don't think so, but maybe I'm wrong.

Unfortunately, the semantics are different. cube:variables reference variables within a netCDF file

No, just in general data cubes. It's not meant to be netCDF specific.

whereas osc:variables refer to physical ones, like the concentration of a trace gas or a temperature or something similar.

Okay, makes sense.
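A hedged sketch of how the two fields could sit side by side in one Collection. The `cube:*` fields follow the datacube extension as far as I know it; the variable names, units, and the idea that `osc:variables` is a plain list of strings are assumptions made up for this example.

```python
collection = {
    "id": "example-trace-gas-product",
    # Data cube structure: dimensions and the arrays stored along them.
    "cube:dimensions": {
        "time": {"type": "temporal", "extent": ["2020-01-01T00:00:00Z", None]},
        "lat": {"type": "spatial", "axis": "y", "extent": [-90, 90]},
        "lon": {"type": "spatial", "axis": "x", "extent": [-180, 180]},
    },
    "cube:variables": {
        # Name of the array as stored in the data cube (hypothetical).
        "no2_col": {
            "dimensions": ["time", "lat", "lon"],
            "type": "data",
            "unit": "mol/m^2",
        },
    },
    # Physical quantity the product describes (assumed to be a list of names).
    "osc:variables": ["Nitrogen dioxide concentration"],
}

data_array_names = sorted(collection["cube:variables"])
physical_variables = collection["osc:variables"]
print(data_array_names, physical_variables)
```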

I like the concept of a contacts field similar to providers. I think it would be an abuse to use providers for personal contact information.

Yes, so a new contacts extension would be great to have, I think. It could also be a bit more advanced than what providers offers by default and follow e.g. the ISO standards with regard to fields.

osc:missions: What's the difference to using "missions" from common metadata in summaries?

This could work, but I do have a gripe with it. In the spec it says:

Collections are strongly recommended to provide summaries of the values of fields that they can expect from the properties of STAC Items contained in this Collection.

Depends on your use case, but in general you could use summaries. It's just a recommendation, which we always break in openEO, for example ;-) Do you have items for the collections? Anyway, it just feels weird to not re-use an existing field that fits the purpose exactly.

But the items themselves are sometimes using data from different satellite missions to form the scientific product. So the mission property of a STAC Item would be inadequate by itself.

I'm not sure I understand, but why do the items use missions that are from the product? That doesn't make sense to me. The summary should summarize all potential missions...

constantinius commented 1 year ago

osc:region: I've seen this a couple of times. Could this be a separate extension that gives a human readable name for the given spatial extent/geometry?

I think that this field has some re-use value. Do you have some examples of where this was used before?

I can't remember all places, but for example noaa_mrms_qpe:region from https://github.com/stac-extensions/noaa-mrms-qpe seems very similar. I think a geolocation name extension would be interesting. Might be just a single field, but nevertheless.

I'm all for this. I will keep osc:region until that one arrives, I guess.


Good question. First, I don't quite understand the difference between keywords and subjects:keywords (I guess there is none? https://github.com/stac-extensions/subjects/issues/1#issuecomment-1164553888)

keywords is just like tags, it's for uncontrolled free-form "vocabularies". The subjects extension is for keywords that belong to controlled vocabularies.

I assume your themes are pre-defined somewhere, so it would be more fitting to use the subjects extension, I think. subjects is still under discussion, also with OGC API Records, but I think it would fit your use case.

I see, seems like it makes sense. I don't quite understand how exactly to use the extension. Maybe @kalxas can provide some insight here.


I like the concept of a contacts field similar to providers. I think it would be an abuse to use providers for personal contact information.

Yes, so a new contacts extension would be great to have, I think. It could also be a bit more advanced than what providers offers by default and follow e.g. the ISO standards with regard to fields.

Again, @kalxas has good insight into what is provided by ISO, so we could adapt this. As with osc:region, I'm up for changing the current osc:technical_officer once the contacts extension is available.


Do you have items for the collections?

There will be, but currently nothing special will be required for them and their shape will depend on the science product they are part of. Some will have extensive metadata (like datacube stuff) and others not so much.

But the items themselves are sometimes using data from different satellite missions to form the scientific product. So the mission property of a STAC Item would be inadequate by itself.

I'm not sure I understand, but why do the items use missions that are from the product? That doesn't make sense to me. The summary should summarize all potential missions...

What I mean is that a single Item can use data from various missions to form a file. So we would require a missions field on the Item instead of mission, so the summary should also be on that field. Maybe I misunderstood, but it sounds to me that summaries are what is to be expected of the items found in that collection.
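To illustrate the point, a rough sketch of deriving a Collection-level summary from Items where some Items combine several missions. The plural `missions` property is hypothetical (it is exactly the field that would have to be introduced), and `summarize_missions` and the sample Items are made up:

```python
def summarize_missions(items):
    """Collect every mission mentioned by any Item, sorted and de-duplicated."""
    missions = set()
    for item in items:
        properties = item.get("properties", {})
        # Existing common metadata field: a single mission name.
        if "mission" in properties:
            missions.add(properties["mission"])
        # Hypothetical plural field for products built from several missions.
        missions.update(properties.get("missions", []))
    return sorted(missions)


# Made-up Items: one single-mission, one combining two missions.
items = [
    {"id": "a", "properties": {"mission": "Sentinel-5P"}},
    {"id": "b", "properties": {"missions": ["Sentinel-5P", "Aura"]}},
]

all_missions = summarize_missions(items)
print(all_missions)
```

A single `mission` string on the second Item could only name one of the two sources, which is the inadequacy described above.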

m-mohr commented 1 year ago

I'm all for this. I will keep osc:region until that one arrives, I guess.

As STAC is a community effort, it's up to you to make it arrive or it may not arrive (in time for you).

I see, seems like it makes sense. I don't quite understand how exactly to use the extension. Maybe @kalxas can provide some insight here.

or @emmanuelmathot

Again, @kalxas has good insight into what is provided by ISO, so we could adapt this. As with osc:region, I'm up for changing the current osc:technical_officer once the contacts extension is available.

Yeah, there's likely also room for alignment with OGC API - Records @pvretano. But again, someone needs to lead this, and as you have a need for it right now, it might be up to you to lead this.

What I mean is that a single Item can use data from various missions to form a file. So we would require a missions field on the Item instead of mission

But your extension right now only has the scope "Collections", so it's not usable in Items either.

but it sounds to me that summaries are what is to be expected of the items found in that collection.

Usually (but not always)

kalxas commented 1 year ago

I see, seems like it makes sense. I don't quite understand how exactly to use the extension. Maybe @kalxas can provide some insight here.

or @emmanuelmathot

OGC API - Records have two types of keywords:

Again, @kalxas has good insight into what is provided by ISO, so we could adapt this. As with osc:region, I'm up for changing the current osc:technical_officer once the contacts extension is available.

Yeah, there's likely also room for alignment with OGC API - Records @pvretano. But again, someone needs to lead this, and as you have a need for it right now, it might be up to you to lead this.

Contacts are a big structure in ISO 19115 (heavily used in CSW). For OGC API - Records we use the providers property (a list of providers qualified by their role in association to the record). Again, see https://github.com/opengeospatial/ogcapi-records/blob/master/core/standard/clause_7_record.adoc#core-queryables-resource-table for details. I would use the providers property with a "technical officer" role assigned.

m-mohr commented 1 year ago

OGC API - Records have two types of keywords:

I agree. We just need to finalize the subjects extension in STAC. For this I've asked Records how stable their spec is with regard to themes: https://github.com/opengeospatial/ogcapi-records/issues/178#issuecomment-1504972516

Contacts are a big structure in ISO 19115 (heavily used in CSW). For OGC API - Records we use the providers property (a list of providers qualified by their role in association to the record). Again, see https://github.com/opengeospatial/ogcapi-records/blob/master/core/standard/clause_7_record.adoc#core-queryables-resource-table for details. I would use the providers property with a "technical officer" role assigned.

STAC is pretty strict about what a provider can be. We can't extend the role to be technical officer. Also, a technical officer is not really a "provider". We need to find a way out. A contacts extension in STAC? The conflict between Records and STAC is generally an issue. I've proposed a way out here: https://github.com/opengeospatial/ogcapi-records/issues/178#issuecomment-1504972516
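As a small illustration of that strictness: the STAC 1.0 Provider Object only allows the roles `licensor`, `producer`, `processor` and `host`, so a check like the following sketch would flag the proposed role. The helper `invalid_provider_roles` and the sample providers are made up for this example.

```python
# Roles allowed for a STAC Provider Object (STAC 1.0).
STAC_PROVIDER_ROLES = {"licensor", "producer", "processor", "host"}


def invalid_provider_roles(provider):
    """Return the roles of a provider that STAC does not allow, sorted."""
    return sorted(set(provider.get("roles", [])) - STAC_PROVIDER_ROLES)


# Hypothetical providers for illustration.
organisation = {"name": "ESA", "roles": ["producer", "host"]}
person = {"name": "Jane Doe", "roles": ["technical officer"]}

print(invalid_provider_roles(organisation))
print(invalid_provider_roles(person))
```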

So potential new extensions in STAC:

emmanuelmathot commented 1 year ago

Sorry for my late comment, but I totally agree with @m-mohr. Many fields in this extension can be covered by others.

So potential new extensions in STAC:

* subjects instead of `osc:themes`: https://github.com/stac-extensions/subjects

* contacts instead of `osc:technical_officer` and `osc:consortium`: [Contacts extension radiantearth/stac-spec#1224](https://github.com/radiantearth/stac-spec/issues/1224)

* region name instead of `osc:region`: [Human-readable "Region name" for bbox/geometry? radiantearth/stac-spec#1225](https://github.com/radiantearth/stac-spec/issues/1225)

m-mohr commented 1 year ago

Closing this, most points have been incorporated. We can discuss problems and improvements better in separate issues now.