radiantearth / geo-ml-model-catalog

Geospatial ML Model Catalog Spec
Apache License 2.0

Add Usage Recommendations section #18

duckontheweb closed 3 years ago

duckontheweb commented 3 years ago

Adds a Usage Recommendations section for describing conditions under which a model should behave as expected.

This section is based on the outline from the original Google Doc, except that it leaves out the "Input Data" element from that overview. Finding a consistent representation for restrictions on input data (e.g. GSD, bands) that is both machine-readable and understandable to humans is important, but it is also a difficult problem. In the interest of pushing towards an initial release of this spec that we can begin testing, I think it might be best to tackle that problem in a separate PR so that we can have a more focused discussion.

The term "behave as expected" is admittedly a bit vague. It didn't seem appropriate to use the term "perform well" since there is nothing in the current spec that guarantees that a model "performs well" even under conditions identical to the training environment. If anyone has suggestions on how to better define this I'm open to input.

duckontheweb commented 3 years ago

Merged in the new Model Training section as well to make it easier to review these changes in the full context of the spec.

duckontheweb commented 3 years ago

@calebrob6 @batic No pressure, but if you want to leave comments on this, please do. I'll plan on merging by 4 PM EDT today if there are no pressing concerns.

batic commented 3 years ago

The section very aptly addresses where "the model will work well". In my experience, we often know also where/when the model will not perform (well, or at all). As such, perhaps the section should be extended (or re-worded) so that the information therein can also identify areas/... where the model is not to be used.

duckontheweb commented 3 years ago

> The section very aptly addresses where "the model will work well". In my experience, we often know also where/when the model will not perform (well, or at all). As such, perhaps the section should be extended (or re-worded) so that the information therein can also identify areas/... where the model is not to be used.

That's a really good point. It seems like restructuring this to include "recommended" and "discouraged" sections with the same structure might be the best approach. That way users could apply the same filter logic (just inverted) to find both recommended and discouraged conditions.

duckontheweb commented 3 years ago

> It seems like restructuring this to include "recommended" and "discouraged" sections with the same structure might be the best approach.

The section has been updated in a829170faf27f073f751090433138c1059567e6d to include a field for "recommendations" and a field for "cautions." Each of these is a list of objects describing the conditions that apply to that recommendation/caution.
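
As a rough illustration of how the two lists could share one filter, here is a minimal Python sketch. Only the field names "recommendations" and "cautions" come from this thread; the keys inside each condition object (`land_cover`, `season`), the model ids, and the helper names `matches` and `models_with` are hypothetical and are not taken from the spec.

```python
# Hypothetical catalog entries. Only the "recommendations" and "cautions"
# field names come from the discussion; the condition keys are assumptions.
MODELS = [
    {
        "id": "model-a",
        "recommendations": [{"land_cover": "cropland", "season": "summer"}],
        "cautions": [{"land_cover": "urban"}],
    },
    {
        "id": "model-b",
        "recommendations": [{"land_cover": "urban"}],
        "cautions": [],
    },
]


def matches(condition, query):
    """Return True if every key in the condition object agrees with the query."""
    return all(query.get(key) == value for key, value in condition.items())


def models_with(models, field, query):
    """Return ids of models that have at least one matching object under `field`."""
    return [
        model["id"]
        for model in models
        if any(matches(condition, query) for condition in model.get(field, []))
    ]


query = {"land_cover": "urban", "season": "summer"}

# The same function serves both lists; only the field name changes.
recommended = models_with(MODELS, "recommendations", query)
cautioned = models_with(MODELS, "cautions", query)
```

Because both fields hold lists of the same kind of object, a client would only need to swap the field name to go from "where is this model recommended" to "where should it be avoided".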

duckontheweb commented 3 years ago

@batic I'm going to merge this so we can cut an initial release. If you have any concerns about the structure, feel free to open an issue and we can touch things up in a future PR. Thanks for your input on this!