stac-extensions / mlm

STAC Machine Learning Model (MLM) Extension to describe ML models, their training details, and inference runtime requirements.
https://stac-extensions.github.io/mlm/
Apache License 2.0

Revising supported Norm types in the norm enum to more commonly used norm types #24

Open fmigneault opened 3 hours ago

fmigneault commented 3 hours ago

@rbavery cloned issue crim-ca/mlm-extension#26 on 2024-06-21:

:rocket: Feature Request

We should revise the norm type enum to remove these norm types:

- `inf`
- `l1`
- `l2`
- `l2sqr`
- `hamming`
- `hamming2`
- `type-mask`
- `relative`

and instead keep the others and add `max_norm`.

I think we should also provide guidance on what Statistics to provide for each norm type in the enum, accounting for clip and max norm.
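As a starting point, here is a hypothetical sketch of that guidance, mapping each kept norm type to the Statistics fields it would require (field names follow the STAC statistics object; the mapping itself is a proposal, not part of the current spec):

```python
# Hypothetical mapping from norm type to the Statistics fields it needs.
# Proposal only; not part of the current specification.
REQUIRED_STATISTICS = {
    "min-max": ["minimum", "maximum"],
    "z-score": ["mean", "stddev"],
    "clip": ["minimum", "maximum"],  # values outside [minimum, maximum] are clipped
    "max": ["maximum"],              # divide by the maximum value
}
```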

:sound: Motivation

The norm types currently specified in the enum are as follows:

- `min-max`
- `z-score`
- `l1`
- `l2`
- `l2sqr`
- `hamming`
- `hamming2`
- `type-mask`
- `relative`
- `inf`
- `clip`

I think some of these norms that change the shape of the array/tensor inputs are uncommon when preparing inputs for machine learning models applied to overhead imagery for the computer vision tasks we identify in the task enum. Maybe these are more relevant for regression models or other kinds of models where it isn't important to preserve the input shape. Do you think it is worth keeping these @fmigneault ?


- `inf`
- `l1`
- `l2`
- `l2sqr`
- `hamming`
- `hamming2`
- `type-mask`
- `relative`

I've only encountered these norms in my work, which is biased toward computer vision tasks (detection, segmentation, classification):

- `min-max`
- `z-score`
- `clip`

Additionally, I've encountered and am using max normalization, where each image is divided by the maximum. This isn't currently accounted for in the enum, so I suggest we add it with the key `max`.

I'd like to specify this so that I know to divide model inputs by 255.

I think we had this enum in a previous version as max_norm but it was removed at some point.
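For concreteness, a minimal numpy sketch of this max normalization (illustrative only; the 255 case matches the "divide by 255" example above):

```python
import numpy as np

# Max normalization: divide by the maximum, either the per-image data
# maximum or the data type's maximum (255 for uint8), scaling into [0, 1].
image = np.random.randint(0, 256, size=(3, 64, 64), dtype=np.uint8)
by_data_max = image / image.max()   # per-image maximum
by_dtype_max = image / 255.0        # uint8 maximum, i.e. "divide by 255"
```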

:satellite: Alternatives

We keep this list as is and I use the Processing expression to represent max norm. This is less than ideal, I think, since max norm seems relatively common (more common than the l1, l2, or hamming norms) for many of the tasks we describe in the task enum.

:paperclip: Additional context

If we agree that we should remove the inf, l1, l2, etc. norms from the enum, I think we should also remove the reference to OpenCV norm types.

This brings up the question, do we need to refer to an external authoritative source that lists common norm types? Or can we be that authoritative list? I haven't found any list more comprehensive than OpenCV's. Some libs like kornia, torchvision, R's RStoolbox provide a normalize function that assumes z-score normalization.
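For illustration, torchvision's Normalize is exactly the z-score case, (x - mean) / std applied per channel (the mean/std values below are the common ImageNet defaults, used here only as example numbers):

```python
import torch
from torchvision import transforms

# torchvision's Normalize implements per-channel z-score normalization.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
out = normalize(torch.rand(3, 224, 224))
```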

fmigneault commented 3 hours ago

@fmigneault commented on 2024-06-21:

> remove these norm types

I do not see any reason to do this. Even if they are not common, they are valid values. I prefer to keep potentially matching names that could give a user a hint, increasing the chances of finding a matching type.

Note that these names are only inspired by the OpenCV reference mentioned in https://github.com/crim-ca/mlm-extension?tab=readme-ov-file#normalize-enum. However, this Enum is not restrictive. Virtually any value can be used at anyone's discretion (see https://github.com/crim-ca/mlm-extension/blob/6338f4d25867222509efe06807b2094eb2725f8b/json-schema/schema.json#L572-L592). The enum names are only suggestions.

> I think we should also provide guidance on what Statistics to provide for each norm type in the enum, accounting for clip and max norm.

Fully agree. As described in https://github.com/crim-ca/mlm-extension?tab=readme-ov-file#normalize-enum, some guidance is given for the very common min-max and z-score cases, but others could be provided as well.
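For reference, a minimal sketch of those two common cases (per-band statistics assumed; illustrative, not normative):

```python
def min_max(x, minimum, maximum):
    # Rescale into [0, 1] using the band's minimum/maximum statistics.
    return (x - minimum) / (maximum - minimum)

def z_score(x, mean, stddev):
    # Center and scale using the band's mean/stddev statistics.
    return (x - mean) / stddev
```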

> Do you think it is worth keeping these

Yes. I think it is very important to make the extension as flexible and reusable as possible. We must not limit our cases to computer vision, classification, segmentation, etc. For example, working with "pixels" might make no sense for a model working with climate data represented by simple geolocated data points. The normalization might not even be a 2D or pixel-wise operation.

> I suggest we add it with the key max

> I'd like to specify this so that I know to divide model inputs by 255.

No objection. Maybe a bit redundant though?

Wouldn't it be equivalent to min-max: 0-255 for a uint8, or whichever equivalent for unsigned/uint16/etc.?
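A quick sketch of that equivalence (assuming uint8 inputs):

```python
import numpy as np

# For uint8 data, "max" normalization (x / 255) and "min-max" with
# minimum=0, maximum=255 give identical results: (x - 0) / (255 - 0) == x / 255.
x = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8).astype(np.float32)
assert np.allclose(x / 255.0, (x - 0.0) / (255.0 - 0.0))
```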

If we start accumulating a few "required statistics" when a specific normalization type is specified, maybe we should start formatting them better in a table under the "Normalization Enum" section to make things more easily readable.

Maybe we should consider explicit validations in the JSON-schema for those cases as well since they are well-known and would be well-defined in the specification.
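For example, a conditional rule could tie z-score to its required statistics (illustrative only, not the actual schema.json contents; shown as a Python dict in JSON Schema draft-07 form):

```python
# Hypothetical JSON Schema fragment: when norm_type is "z-score",
# require a statistics object carrying mean and stddev.
Z_SCORE_RULE = {
    "if": {
        "properties": {"norm_type": {"const": "z-score"}},
        "required": ["norm_type"],
    },
    "then": {
        "properties": {"statistics": {"required": ["mean", "stddev"]}},
        "required": ["statistics"],
    },
}
```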

> This brings up the question, do we need to refer to an external authoritative source that lists common norm types?

No, we don't need to. However, the first question a user will have on seeing an unknown name is "what does it do?". Therefore, I prefer to leave at least some reference to what it means rather than leaving it up to interpretation. That being said, additional/alternative references could be added as well. I used OpenCV only because that's the easiest single-link reference I could find providing all the normalization techniques mentioned, with their corresponding descriptions and formulas.

> Or can we be that authoritative list?

I personally prefer to stay away from this to avoid the maintenance burden. OpenCV's list is already complicated enough to replicate. I wouldn't want to have to deal with this complexity on our end as well.

> Some libs like kornia, torchvision, R's RStoolbox provide a normalize function that assumes z-score normalization

Absolutely. Feel free to provide the references if those can be identified. Maybe include them in the previously mentioned table to get things more organized.

fmigneault commented 3 hours ago

@rbavery commented on 2024-06-22:

> However, this Enum is not restrictive. Virtually any value can be used at anyone's discretion

I don't think this is the case? I get a validation error when supplying a value that is not in the enum.


```python
# ModelInput and ProcessingExpression are the pydantic models from the
# stac-model package; band_names, input_array, and stats are defined
# earlier in the session.
model_input = ModelInput(
    name="9 Band Sentinel-2 4 Time Step Series Batch",
    bands=band_names,
    input=input_array,
    norm_by_channel=True,
    norm_type="max",
    resize_type="crop",
    statistics=stats,
    pre_processing_function=ProcessingExpression(
        format="documentation-link",
        expression="https://github.com/allenai/satlas/blob/main/CustomInference.md#sentinel-2-inference-example",
    ),
)
```

```
ValidationError: 1 validation error for ModelInput
norm_type
  Input should be 'min-max', 'z-score', 'l1', 'l2', 'l2sqr', 'hamming', 'hamming2', 'type-mask', 'relative' or 'inf' [type=literal_error, input_value='max', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/literal_error
```

> No objection. Maybe a bit redundant though?
>
> Wouldn't it be equivalent to min-max: 0-255 for a uint8, or whichever equivalent for unsigned/uint16/etc.?

True, I don't mind keeping just `min-max`. It's just a tiny bit more overhead to specify the minimum value.

> I used OpenCV only because that's the easiest single-link reference I could find providing all the normalization techniques mentioned, with their corresponding descriptions and formulas.

Yeah, it does list a lot of methods compared to other libs. My concern is that these won't be the methods most folks using this extension pay attention to, and they may get confused. That OpenCV page is also missing z-score.

Also cv2.NORM_TYPE_MASK and cv2.NORM_RELATIVE are flags that modify the behavior of other norm types in cv2. So including them as equivalent norm types in the norm type enum feels out of place. Can we remove these?

https://docs.opencv.org/4.x/d2/de8/group__core__array.html#gad12cefbcb5291cf958a85b4b67b6149f
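A quick demonstration of why they are flags rather than standalone norms, using OpenCV's actual constants:

```python
import cv2
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.5, 2.0, 2.5])

# NORM_RELATIVE only modifies a base norm: it computes ||a - b|| / ||b||
# for whichever base norm type it is OR'ed with.
rel_l2 = cv2.norm(a, b, cv2.NORM_L2 | cv2.NORM_RELATIVE)

# NORM_TYPE_MASK is a bitmask for extracting the base norm type from a
# combined flag value, not a norm type itself.
assert (cv2.NORM_L2 | cv2.NORM_RELATIVE) & cv2.NORM_TYPE_MASK == cv2.NORM_L2
```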

> Feel free to provide the references if those can be identified. Maybe include them in the previously mentioned table to get things more organized.

Can do!

fmigneault commented 3 hours ago

@fmigneault commented on 2024-07-08:

> I don't think this is the case? I get a validation error when supplying a value that is not in the enum.

Oh! Seems like I had an older definition in mind.

I thought it still behaved like mlm:framework, allowing "almost any" string.

https://github.com/crim-ca/mlm-extension/blob/6338f4d25867222509efe06807b2094eb2725f8b/json-schema/schema.json#L258-L288

Seems reasonable to add the new norm value if it cannot be fulfilled by existing methods. If it can, though, it is better to avoid the partially duplicate definition (i.e., `max` as a "simplified" `min-max`).

> including them as equivalent norm types in the norm type enum feels out of place. Can we remove these?

Good shout. Yes, let's remove them.