openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
https://openvinotoolkit.github.io/training_extensions/
Apache License 2.0

Question around OTX 2.x moving mean_values and scale_values from "conversion_parameters" to "model_info" within IR XML #3898

Closed · j99ca closed this issue 2 days ago

j99ca commented 2 weeks ago

When upgrading from OTX 1.6 to 2.x, I noticed that mean_values and scale_values appear to have moved within the IR XML file from "conversion_parameters" to "model_info".

Am I right in thinking this means that image normalization is no longer baked into the exported model's layers, and will instead need to be performed explicitly at runtime by the user when running the IR format of the model at the edge? I didn't see anything in the documentation about this.

This relates slightly to the issue I opened about custom mean_values and scale_values, #3827.

Is there a way to ensure the exported model serializes these operations so they don't have to be handled with custom I/O code? We could use the OpenVINO preprocessing API (sketched below), but this seems cumbersome given that previous versions of the models didn't need it.
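
For reference, this is roughly the post-export step I would like to avoid. It is an untested sketch using OpenVINO's PrePostProcessor; the mean/scale values are the usual ImageNet ones and only placeholders, and the file names are made up:

import openvino as ov
from openvino.preprocess import PrePostProcessor

core = ov.Core()
model = core.read_model("exported_model.xml")

ppp = PrePostProcessor(model)
# incoming frames are uint8 RGB in NHWC layout
ppp.input().tensor().set_element_type(ov.Type.u8).set_layout(ov.Layout("NHWC"))
# bake normalization into the graph: cast to float, subtract mean, divide by scale
ppp.input().preprocess() \
    .convert_element_type(ov.Type.f32) \
    .mean([123.675, 116.28, 103.53]) \
    .scale([58.395, 57.12, 57.375])
ppp.input().model().set_layout(ov.Layout("NCHW"))
model = ppp.build()

# persist the modified IR so the edge runtime no longer needs custom preprocessing
ov.save_model(model, "exported_model_with_preproc.xml")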

On an unrelated note, is there a way for classification models (after export to IR) to include the softmaxed output directly in the model? The model I am using (MobileNet V3) returns logits, which need to be softmaxed to turn into a score, and I think most inference cases at the edge would use the softmax score rather than the raw logits. This is a minor issue, since we can use the OpenVINO API to add that conversion to the output (sketch below), but it seems strange that the object detection models return scores while the classification models need additional postprocessing to produce one.
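
Again for reference, this is the workaround I have in mind. It is untested and assumes the PrePostProcessor custom postprocessing step can be used to wrap the logits output in a Softmax node; the file names are made up:

import openvino as ov
import openvino.runtime.opset13 as ops
from openvino.preprocess import PrePostProcessor

core = ov.Core()
model = core.read_model("classifier.xml")

ppp = PrePostProcessor(model)
# append a Softmax over the class axis to the logits output
ppp.output().postprocess().custom(lambda logits: ops.softmax(logits, axis=1))
model = ppp.build()

ov.save_model(model, "classifier_with_softmax.xml")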

sovrasov commented 2 weeks ago

conversion_parameters is just a log of the CLI parameters passed to the now-deprecated ov.mo tool. That tool could embed mean/std values right into the model, whereas the current tool, ovc, suggests using OpenVINO's PrePostProcessor class instead. In OTX we don't expose the IR structure to users, so there is no documentation describing the pre- and post-processing required to infer a model. Instead, we have ModelAPI, which can seamlessly load an OV XML model generated by OTX and perform inference out of the box (on an RGB uint8 numpy image); a minimal example is below. ModelAPI can also be used for OV model preprocessing: it inserts mean/std + resize nodes + softmax (for multiclass classification only): https://github.com/openvinotoolkit/model_api?tab=readme-ov-file#prepare-a-model-for-inferenceadapter
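
A minimal sketch of the out-of-the-box path (it assumes the model_api Python package is installed; exact import paths can differ between releases, and the file names are placeholders):

import cv2
from model_api.models import Model

# ModelAPI reads the model_info section embedded by OTX and builds
# preprocessing, inference, and postprocessing automatically
model = Model.create_model("exported_model.xml")

image = cv2.cvtColor(cv2.imread("sample.jpg"), cv2.COLOR_BGR2RGB)  # RGB uint8
predictions = model(image)
print(predictions)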

sovrasov commented 2 weeks ago

An advantage of ModelAPI over custom inference methods is that it has been tested together with OTX and shown to generate correct predictions. At the very least, it can serve as a source of code snippets for implementing custom inference.

j99ca commented 2 weeks ago

@sovrasov I should probably ask this in the model_api repo, but I see you are a contributor there:

It looks like the create_model function only works with a path to the model, and presumably does the file I/O itself. Is there a way to pass bytes directly instead? Or to support cloud and local URIs if configured? For example, with ov.Core we can load models into memory in a cloud/local-agnostic way:

import fsspec
import openvino as ov

# fsspec resolves both local paths and cloud URIs (e.g. s3://, gs://)
with fsspec.open(xml_uri, "rb") as xml_f, fsspec.open(bin_uri, "rb") as bin_f:
    core = ov.Core()
    model = core.read_model(xml_f.read(), bin_f.read())

sovrasov commented 2 weeks ago

@j99ca passing bytes is possible, but it takes a few more lines of code. You need to create an OpenvinoAdapter first:


import openvino as ov
from model_api.adapters import OpenvinoAdapter
from model_api.models import Model

model_bytes = open("vehicle-segmentation-g-0005.xml", "rb").read()
weights_bytes = open("vehicle-segmentation-g-0005.bin", "rb").read()
adapter = OpenvinoAdapter(core=ov.Core(), model=model_bytes, weights_path=weights_bytes)
model = Model.create_model(adapter, configuration={}, preload=True)