stac-extensions / processing

Indicates from which processing chain data originates and how the data itself has been produced.
Apache License 2.0

Add a `Software` object #29

Closed gadomski closed 6 months ago

gadomski commented 1 year ago

This extension is used to describe processing history of both assets and the STAC metadata. For STAC metadata, the software that was used to create the data is often public (at least in part). We could provide more guidance/structure around how to provide that information. I haven't fully thought this through, but that might look like this:

| Field Name | Type | Description |
| --- | --- | --- |
| processing:software | List\<Software> | Software that was used to create assets or the STAC object itself, in order from the first software used to the last |

And then the Software object:

| Field Name | Type | Description |
| --- | --- | --- |
| name | string | REQUIRED: The name of the software package or script that was used, e.g. `stactools-landsat` |
| href | string | An href to the source code, e.g. https://github.com/stactools-packages/landsat. This could also be the href from a package management system, e.g. https://pypi.org/project/stactools-landsat/ |
| version | string | A version identifier, e.g. `v1.2.3`. This could also be an SHA from a version control system, e.g. `6dcb09b5b57875f334f61aebed695e2e4193db5e` |
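
For illustration, here is a minimal Python sketch of what an Item's properties might look like under this proposal. The field names and example values come from the proposal itself; the second entry's name (`my-postprocessing-script`) and the `validate_software` helper are hypothetical and only check what the proposal states, namely that `name` is required and all three fields are strings.

```python
from typing import Any

# Proposed `processing:software` value: a list of Software objects,
# ordered from the first software used to the last.
properties: dict[str, Any] = {
    "processing:software": [
        {
            "name": "stactools-landsat",
            "href": "https://github.com/stactools-packages/landsat",
            "version": "v1.2.3",
        },
        # Only `name` is required; this entry name is made up for illustration.
        {"name": "my-postprocessing-script"},
    ]
}


def validate_software(entries: list[dict[str, Any]]) -> None:
    """Hypothetical check: `name` is required, all fields must be strings."""
    for i, entry in enumerate(entries):
        name = entry.get("name")
        if not isinstance(name, str) or not name:
            raise ValueError(f"processing:software[{i}] is missing a valid 'name'")
        for key in ("href", "version"):
            if key in entry and not isinstance(entry[key], str):
                raise ValueError(f"processing:software[{i}].{key} must be a string")


validate_software(properties["processing:software"])
```

Because the value is a list, the position of each entry carries the processing order, which a dictionary keyed by name would not guarantee.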

m-mohr commented 1 year ago

If you just need the href additionally, why not just provide links?

gadomski commented 1 year ago

> why not just provide links?

href is not required, and it would be if we used links?

m-mohr commented 1 year ago

But if you don't have an href for a link, just don't add a Link Object to the links array? There would be no close relation between the software and the link though. I'm a bit hesitant of the breaking change...

gadomski commented 1 year ago

> But if you don't have an href for a link, just don't add a Link Object to the links array? There would be no close relation between the software and the link though. I'm a bit hesitant of the breaking change...

I'm a little less worried because there will be a good forward migration path (use the current key as the name, use the current other field as the version). Your comment did make me realize that I think it should be a list, not a dictionary, and I've updated my proposal accordingly.
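
The forward migration described here can be sketched in a few lines of Python. This assumes the current `processing:software` value is a mapping from software name to a version string, as the comment implies; the `migrate_software` function name is hypothetical.

```python
def migrate_software(old: dict[str, str]) -> list[dict[str, str]]:
    """Convert a name -> version mapping into a list of Software objects."""
    migrated = []
    for name, version in old.items():
        entry = {"name": name}  # current key becomes `name`
        if version:
            entry["version"] = version  # current value becomes `version`
        migrated.append(entry)
    return migrated


old_value = {"stactools-landsat": "v1.2.3"}
new_value = migrate_software(old_value)
# new_value == [{"name": "stactools-landsat", "version": "v1.2.3"}]
```

The resulting list follows the mapping's insertion order, which may not match the actual processing order, so a migrated list with more than one entry might still need to be reordered by hand.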

m-mohr commented 6 months ago

@gadomski I've added a relation type `processing-software` so that this can be externalized, see #32. This could e.g. link to an even more advanced version, e.g. a `Pipfile.lock`, `package-lock.json` or something custom as proposed by you. I'm not quite sure we should have this level of detail in the STAC Items itself - and it's still breaking ;-) Maybe the rel type is a good enough compromise for now?
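
As a rough sketch of the externalized approach, a link with the `processing-software` relation type might be added to an Item's `links` array as below. Only the `rel` value comes from this thread; the `href`, the `title`, and the `software_links` helper are hypothetical.

```python
from typing import Any

# Hypothetical Item links array; the href is a placeholder, not a real resource.
item_links: list[dict[str, Any]] = [
    {"rel": "self", "href": "https://example.com/items/scene-1.json"},
    {
        "rel": "processing-software",
        "href": "https://example.com/pipelines/landsat/Pipfile.lock",
        "title": "Locked dependencies of the processing pipeline",
    },
]


def software_links(links: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return only the links that point at processing software descriptions."""
    return [link for link in links if link.get("rel") == "processing-software"]


for link in software_links(item_links):
    print(link["href"])
```

With this approach the Item stays unchanged for existing consumers, and the detailed software description lives in whatever file the link targets.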