cc: @ymoisan @sfoucher
cc: @HamedAlemo
@m-mohr Is there a recommended way to handle these failing tests when we haven't published the schema yet? I don't want to publish the JSON schema for v1.0.0 yet since there will be some more breaking changes to come. It looks like it won't actually prevent us from merging and publishing; it's just a bit annoying for now.
Right now it's not failing due to the missing release; it's the markdown style check that's failing, so you just need to fix the issues raised there.
But you also need to change the schema URL in `package.json` to make the schema CI work (replace `template` with `ml-model`). There also seem to be some more occurrences that need to be updated: https://github.com/stac-extensions/ml-model/search?q=template
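For reference, a minimal sketch of what that `package.json` change could look like, assuming the `check-examples` script from the stac-extensions template (the exact script name and validator flags may differ in this repo). The `--schemaMap` entry is what resolves the not-yet-published schema URL to the local `json-schema/schema.json` during CI:

```json
{
  "scripts": {
    "check-examples": "stac-node-validator . --lint --verbose --schemaMap https://stac-extensions.github.io/ml-model/v1.0.0/schema.json=json-schema/schema.json"
  }
}
```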
Sorry, I just fixed that markdown issue. Now it's failing on the schema again:
- /home/runner/work/ml-model/ml-model/examples/item.json
-- Lint: File is well-formed
-- STAC Version: 1.0.0
--- Item: valid
--- https://stac-extensions.github.io/ml-model/v1.0.0/schema.json: -- Schema at 'https://stac-extensions.github.io/ml-model/v1.0.0/schema.json' not found. Please ensure all entries in 'stac_extensions' are valid.
Yes, please check the second paragraph of my post above for details. But in the end you'll need a valid schema and example, of course. Not sure whether that's the case yet.
Thanks, totally missed that the first time around. That did the trick!
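For anyone hitting the same "Schema ... not found" error: the validator tries to fetch the URL listed in the example Item's `stac_extensions` array, which does not exist until the schema is published, and the `schemaMap` change above redirects that lookup to the local schema instead. A sketch of the relevant part of `examples/item.json`, with illustrative values for anything not shown in the CI output:

```json
{
  "stac_version": "1.0.0",
  "type": "Feature",
  "id": "example-ml-model-item",
  "stac_extensions": [
    "https://stac-extensions.github.io/ml-model/v1.0.0/schema.json"
  ]
}
```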
This clarifies the meaning of core STAC spec fields as they refer to models described by the ML Model extension. In general, these fields (including spatial and temporal fields) will describe the data over which the model was trained.
~In some early discussions around how to define the meaning of these fields we had decided that they should represent the recommended usage of the model, which could be different from the training environment and data. However, many of the fields seem to be a more natural fit for the training data. In particular, when using `start_datetime` and `end_datetime` in a STAC Item, we must define both (we cannot represent an open interval using these fields). This could be problematic when representing recommended usage, because a publisher might want to recommend usage for something like "any imagery after 2020-01-01," which does not seem to be possible (at least in my understanding) when using these fields. The datetime range of the training data, however, should be well-known.~

UPDATE 2021-10-26: Based on a suggestion from @m-mohr, we will in fact have these fields represent the "recommended usage" of the model. We will represent open intervals by recommending that publishers use the maximum value (`"9999-12-31T23:59:59Z"`) for `end_datetime` in this case.
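As a sketch of how that convention reads in an Item's properties, assuming a model recommended for any imagery from 2020-01-01 onward (values are illustrative, not taken from the actual example):

```json
{
  "properties": {
    "datetime": null,
    "start_datetime": "2020-01-01T00:00:00Z",
    "end_datetime": "9999-12-31T23:59:59Z"
  }
}
```

Per the core STAC spec, `datetime` may be set to `null` when `start_datetime` and `end_datetime` are provided.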