Closed m-mohr closed 3 years ago
@jjrom Could you please have a look at this PR? I had to deviate a bit from our discussion (see the issue for some context). I'm happy to discuss this again in a short call, too.
@jjrom Do you have an example at hand that fits into the examples folder? It could be an old one that we convert to the new schema. We don't want too many or too long examples in the markdown, to limit the length of the documents, so I'd put them somewhere in the examples folder and link to them. I also don't really want to touch the existing examples, because I don't have a clue what values to put in for counts etc.
Perhaps we could duplicate the "collection.json" example (https://github.com/radiantearth/stac-spec/blob/master/examples/collection.json) to a "collection-extended-summaries.json" example in which we replace
    "platform": [
      "cool_sat2",
      "cool_sat1"
    ]
with
    "platform": {
      "type": "string",
      "oneOf": [
        {
          "const": "cool_sat1",
          "count": 103489
        },
        {
          "const": "cool_sat2",
          "count": 50700
        }
      ]
    }
?
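For illustration, the conversion proposed above (a plain list of values turned into a schema-style summary with a count per value) could be sketched roughly as follows. The function name `summarize_strings` and the sample input are hypothetical, and `count` is the custom keyword from the example in this comment, not a standard JSON Schema keyword.

```python
import json
from collections import Counter


def summarize_strings(values):
    """Build a schema-style summary with one `const` entry per distinct value.

    `count` is the custom keyword used in this thread's example; it is an
    assumption of this sketch, not part of the JSON Schema vocabulary.
    """
    counts = Counter(values)
    return {
        "type": "string",
        "oneOf": [
            {"const": value, "count": n}
            for value, n in sorted(counts.items())
        ],
    }


# Hypothetical platform values collected from a handful of items.
platforms = ["cool_sat1"] * 3 + ["cool_sat2"] * 2
summary = summarize_strings(platforms)
print(json.dumps(summary, indent=2))
```

In a real catalog the counts would come from the item store rather than from an in-memory list.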
@jjrom I think we can only add it to the collection-only folder right now. The others are a full catalog, so the count would likely be 1 or 2 as there are not so many items and we also can't add two collections, because then the items would need to link to two of them.
Maybe we could duplicate, shorten and then convert https://github.com/radiantearth/stac-spec/blob/master/examples/collection-only/collection.json a bit and also add titles to epsg codes as you mentioned in the call?
@m-mohr Ok, that makes sense. In this case we can duplicate https://github.com/radiantearth/stac-spec/blob/master/examples/collection-only/collection.json and update the platform property. Concerning the EPSG codes, I'm fine with adding titles, but only if we massively shorten the property content, for instance by limiting it to two EPSG codes (e.g. EPSG:4326 and EPSG:3857). Otherwise it would be hard to read.
Yes, agreed! Or can you export your "best" example from your implementation, which we then try to convert to the new format? Would be a good verification that the new schema works well with your data.
I plan to update my implementation to fit the new schema next week. If it's not too late I can provide a real example by the end of next week. Otherwise we stick to the existing collection.json duplication/shortening as we discussed (?)
@jjrom I thought about an example from your current implementation. That should be easy enough to migrate "manually".
+1 on 'manual migration'. I have some time today and can try to help.
@schwehr This looks very much like it could be useful for you as you have this gee:schema thing in your catalogs, right? Any thought on the PR? Could this be useful for you, too?
Some initial thoughts while reading to catch up... I'm not familiar enough with jsonschema. And I haven't had a chance to talk it over with simonf.
enum PropertyType {
PROPERTY_TYPE_UNSPECIFIED = 0; // Something is wrong.
STRING = 1;
INT = 2;
DOUBLE = 3;
STRING_LIST = 4;
INT_LIST = 5;
DOUBLE_LIST = 6;
}
The CI fails due to https://github.com/stac-extensions/scientific/pull/2. Once the PR is merged, we need to release scientific as 1.0.1 and change the schema URLs here / in stac-spec.
I'm wondering whether we should mark this new feature 'experimental' (with the chance of breaking in a minor release)?
Put in a suggestion for units: https://github.com/json-schema-org/json-schema-vocabularies/issues/46
I'm wondering whether we should mark this new feature 'experimental' (with the chance of breaking in a minor release)?
Is there a way to make the feature an extension? Happy for it to not be namespaced and even mentioned in the main spec. But that could let us evolve it without having to cut stac-spec releases.
No, it's not really extensible, I'd say. At least the extension would have the potential to break implementations and thus it's not a good extension.
Thanks, @schwehr.
- Is there a way to specify the units?
Not standardized, but custom keywords are a thing.
- Looking at the example, it would be nice to have simple (even if contrived) examples of all of them (as much as jsonschema can support).
Not sure what exactly you are asking for?
- At least one example without a constrained set of values for string and int.
We encourage specifying which values to expect, to make it more useful for users. Just specifying that the field exists (with a specific data type?) would not be a best practice IMHO. Ideally, there would be more information.
- An example that used more of the capabilities of json-schema would be good
I'm not sure STAC is the right place to go into JSON Schema details. Wouldn't it be better to just link to https://json-schema.org/learn/ ?
- We expected to have types of INT, DOUBLE, STRING, INT_LIST, DOUBLE_LIST, or STRING_LIST. It looks like we are not using the DOUBLE_LIST.
There's no direct type value for typed arrays in JSON Schema; instead you do something like { type: 'array', items: { type: 'integer' } } (INT_LIST) or { type: 'array', items: { type: 'string' } } (STRING_LIST). The other types would be integer (INT), number (DOUBLE) and string (STRING).
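As a rough sketch of the mapping described above, the PropertyType names from the enum earlier in the thread can be paired with JSON Schema fragments. Note that JSON Schema's type for floating-point values is `number`. The dictionary and the helper `schema_for` are hypothetical names used for illustration only.

```python
import copy

# PropertyType name -> JSON Schema fragment. Typed lists are expressed as
# arrays whose `items` schema carries the element type.
PROPERTY_TYPE_TO_SCHEMA = {
    "STRING": {"type": "string"},
    "INT": {"type": "integer"},
    "DOUBLE": {"type": "number"},
    "STRING_LIST": {"type": "array", "items": {"type": "string"}},
    "INT_LIST": {"type": "array", "items": {"type": "integer"}},
    "DOUBLE_LIST": {"type": "array", "items": {"type": "number"}},
}


def schema_for(property_type: str) -> dict:
    """Return a fresh JSON Schema fragment for a PropertyType name."""
    if property_type not in PROPERTY_TYPE_TO_SCHEMA:
        raise ValueError(f"no JSON Schema mapping for {property_type!r}")
    # Deep copy so callers can add keywords without mutating the table.
    return copy.deepcopy(PROPERTY_TYPE_TO_SCHEMA[property_type])


print(schema_for("DOUBLE_LIST"))
```

Under this mapping DOUBLE_LIST is covered by the same array pattern as the other two list types.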
Put in a suggestion for units: json-schema-org/json-schema-vocabularies#46
JSON Schema allows for custom keywords, so I'm not sure there's a need for another annotation in a validation language as you can simply define it on your own. But let's see what they think.
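To make the custom-keyword idea concrete, here is a minimal sketch in which a summary schema carries a unit next to the standard `minimum` and `maximum` keywords. The keyword name `unit`, the helper `describe_range`, and the sample values are all assumptions for illustration; nothing here is standardized by JSON Schema or STAC.

```python
def describe_range(field: str, schema: dict) -> str:
    """Render a numeric range summary, appending the unit if one is declared.

    `unit` is a hypothetical custom keyword; `minimum` and `maximum` are
    standard JSON Schema validation keywords.
    """
    text = f"{field}: {schema['minimum']} to {schema['maximum']}"
    unit = schema.get("unit")
    return f"{text} {unit}" if unit else text


# Illustrative summary: the unit lives in the schema, not in the data values.
cloud_cover_summary = {
    "type": "number",
    "minimum": 0,
    "maximum": 100,
    "unit": "%",
}

print(describe_range("eo:cloud_cover", cloud_cover_summary))
```

A consumer that does not know the custom keyword can still read the type and range from the standard keywords.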
Is a custom type available within the actual schema being defined inside the STAC summary? In the case I'm talking about, the data isn't what's going to have the units. The schema says what the units are going to be.
- Looking at the example, it would be nice to have simple (even if contrived) examples of all of them (as much as jsonschema can support).
Not sure what exactly you are asking for?
The examples give little sense of how flexible this is. I agree that the reader should go look at the JSON Schema spec, but the examples here are extremely narrow.
- At least one example without a constrained set of values for string and int.
We encourage specifying which values to expect, to make it more useful for users. Just specifying that the field exists (with a specific data type?) would not be a best practice IMHO. Ideally, there would be more information.
Examples of non-categorical data:
This seems good for now. We can fine-tune docs and examples later.
Related Issue(s): json-schema-org/json-schema-spec#1045
Proposed Changes:
PR Checklist: