m-mohr closed this issue 3 years ago
Maybe a summary scope could be defined at the field level, because you don't always want to summarize every field (e.g. file:checksum). It would also give hints to developers who want to implement functions that generate large collections or catalogs, so they can automatically select which fields should be summarized.
@emmanuelmathot I don't understand that. Could you give an example, please?
All fields from extensions for items are implicitly candidates for summaries in a collection, right? So if you want to automate the summaries based on the items referenced (e.g. via a STAC API), how do you know which extension fields are worth summarizing? I would propose to have a summary scope per field, along with the recommended summary type.
For instance, a collection summary scope per field would set:
eo:cloud_cover -> yes, stats
file:checksum -> no
sar:product_type -> yes, value set
sat:relative_orbit -> yes, stats
With that scope, there is a proper reason to have stac_extensions declared in the collection.
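To make the proposal concrete, here is a minimal sketch of what such a per-field scope could look like in code. The `SUMMARY_SCOPE` table and the `fields_to_summarize` helper are hypothetical names used for illustration only; nothing like this is defined by the STAC spec or its extensions.

```python
# Hypothetical per-field summary scope (an assumption for illustration,
# not part of any STAC specification). None means "do not summarize".
SUMMARY_SCOPE = {
    "eo:cloud_cover": "stats",
    "file:checksum": None,
    "sar:product_type": "value_set",
    "sat:relative_orbit": "stats",
}


def fields_to_summarize(item_properties):
    """Return {field name: summary type} for the fields the scope marks as summarizable."""
    selected = {}
    for name in item_properties:
        kind = SUMMARY_SCOPE.get(name)  # unknown fields are skipped as well
        if kind is not None:
            selected[name] = kind
    return selected


# Example: only eo:cloud_cover is selected; file:checksum and datetime are skipped.
print(fields_to_summarize({
    "eo:cloud_cover": 12.5,
    "file:checksum": "90e4021044a8995dd50b6657a037a7839304535b",
    "datetime": "2021-01-01T00:00:00Z",
}))
```

A tool generating collections could consult such a table instead of guessing which extension fields are worth summarizing.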
Thanks, now I understand. This is basically what issue #1004 is about. This list only specifies whether something should be summarized or not, but we could also make a recommendation on stats or value sets. Although in most cases that should be relatively clear from the data type:
number -> stats
string -> value set (if you want, you can check for ISO timestamps and make them stats)
array -> merged value set
object, boolean -> value set
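The type-based rule above could be sketched roughly as follows. This is only an illustration: the `summarize` function is a hypothetical helper, "stats" is assumed to mean a minimum/maximum range, and the optional ISO timestamp handling is left out.

```python
def summarize(values):
    """Summarize all values of one field, picking the summary type from the data type."""
    # Numbers -> stats (assumed here to be a minimum/maximum range).
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if values and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    ):
        return {"minimum": min(values), "maximum": max(values)}

    # Strings, booleans and objects -> value set; arrays -> merged value set.
    merged = []
    for value in values:
        items = value if isinstance(value, list) else [value]
        for item in items:
            if item not in merged:  # keep first-seen order, drop duplicates
                merged.append(item)
    return merged


print(summarize([10, 3.5, 7]))              # numeric field
print(summarize(["GRD", "SLC", "GRD"]))     # string field
print(summarize([["a", "b"], ["b", "c"]]))  # array field
```

A list is used for the value set instead of a Python `set` so that unhashable values such as objects (dicts) can be deduplicated too.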
This originates from https://github.com/stac-extensions/projection/issues/3
In the STAC collection spec it says:
This was added intentionally in 0.x, but might be outdated now. We added Collection scope to most extensions in rc.1/2 due to the fact that the fields can be used (and validated now) in collection assets (and item asset definitions in collection). A weak point is that we can't validate the summaries and couldn't use the schemas before for validating collections. With the newest changes to the schemas, we should be able to also add extensions to the stac_extensions array that are implemented in summaries (although no validation takes place). So I guess we should remove the wording above to make it more straightforward to implement?
Interestingly, this was not ported over to Catalogs when summaries were introduced there, so the wording is missing there, which makes things even more inconsistent.