Palm detection model: tflite and OpenVINO IR models give different outputs

geaxgx commented 3 years ago

1. OS Ubuntu 18.04

2. OS Architecture x86_64

3. Version of OpenVINO 2021.2.185 (the one from your dockerfile)

4. Version of TensorFlow e.g. v2.4.1 (the one from your dockerfile)

9. Download URL for .tflite : https://github.com/google/mediapipe/blob/master/mediapipe/modules/palm_detection/palm_detection.tflite

Hi Pinto ! First of all, I want to thank you for your latest version of tflite2tensorflow. The dockerfile will surely make users' lives much easier !

I have installed and run the docker image of tflite2tensorflow to convert the Mediapipe palm detection model (see link above) into OpenVINO IR format. This model takes 128x128 images as input, whereas the previous model took 256x256 images. When running the FP32 model on my CPU, I noticed that the palm bounding box sometimes seemed a bit off. Comparing with the output from the original tflite model shows that the bounding boxes are not the same.

Below is the output from the FP32 OpenVINO model: output_hands_openvino_128

Below is the output from the tflite model: output_hands_tflite_128

Note that if I compare the outputs of the older 256x256 models, there are no differences between tflite and Openvino versions.
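
The visual comparison described above could also be expressed as a number. The following is only a minimal sketch, assuming each model's decoded detection is available as an `(x_min, y_min, x_max, y_max)` box; the function name `iou` and the two example boxes are hypothetical and are not taken from the scripts in this thread.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes in normalized image coordinates, not values from the thread.
tflite_box = (0.30, 0.40, 0.60, 0.70)
openvino_box = (0.35, 0.40, 0.65, 0.70)
print(f"IoU = {iou(tflite_box, openvino_box):.3f}")
```

An IoU close to 1.0 would mean the two models agree on the palm box; a noticeably lower value would match the "a bit off" impression.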

Do you have any idea what could explain the different outputs ? Using Netron, I can see that the new tflite model now uses Prelu and ResizeBilinear, which were not used in the older model. I don't see how Prelu could cause differences in the conversion process, but ResizeBilinear may be trickier (it is converted into Interpolate). Do you have any thoughts about that ?

Thanks for your help! I would like to use the new model, which is much faster than the previous version.

I can send you the code to reproduce the problem if you want.

PINTO0309 commented 3 years ago

Perhaps the problem is not with PReLU or ResizeBilinear, but with the lack of PADs in Conv2D and DepthwiseConv2D. It would be very helpful if you could provide us with the verification code that you have, so that we can verify that the problem is solved. :smiley:

geaxgx commented 3 years ago

Ha ha, I am relieved you have an idea ! Here is a zip file with 2 python scripts + models. Please see the README.txt. hands.zip

PINTO0309 commented 3 years ago

I'm sorry I kept you waiting. If I reverse convert saved_model to tflite again before converting to OpenVINO, the results seem to match perfectly. Hmmm. :disappointed_relieved:

Your tflite image
My tflite image (tflite -> saved_model -> tflite)

geaxgx commented 3 years ago

Be assured that I am very grateful for your help, and I also know you are very busy, so please don't be sorry for keeping me waiting :-)

I am not sure I fully understand what you did. You did : tflite -> saved_model -> tflite 2 . Did you also do : tflite 2 -> saved_model 2 -> Openvino ? Or is it not worth it since tflite = tflite 2 ? Does it mean the problem comes from the OpenVINO model optimizer that converts TensorFlow to IR ?

PINTO0309 commented 3 years ago

> tflite -> saved_model -> tflite 2

Yes. That's right.

> Did you do also : tflite 2 -> saved_model 2 -> Openvino ?

I tried it now, but the result was the same and not good. After re-converting to tflite, the detection test is normal, so it looks like there is no problem with saved_model and tflite.

# Reverse transform tflite from JSON
$ ../flatc -o . -b schema.fbs tflite2.json

# Generating saved_model2 from tflite2
$ tflite2tensorflow \
  --model_path tflite2.tflite \
  --flatc_path ../flatc \
  --schema_path schema.fbs \
  --output_pb True

# Generate OpenVINO IR and tflite float32 from saved_model2
$ tflite2tensorflow \
  --model_path tflite2.tflite \
  --flatc_path ../flatc \
  --schema_path schema.fbs \
  --output_no_quant_float32_tflite True \
  --output_openvino_and_myriad True

> Does it mean the problem comes from the Openvino model optimizer that converts tensorflow to IR ?

I can't say for sure because I haven't done enough research yet. However, based on the current results alone, there seems to be a problem with the saved_model -> OpenVINO IR conversion. (Openvino model optimizer)

PINTO0309 commented 3 years ago

It worked. Screenshot 2021-02-19 23:19:22

All I had to do was change align_corners in the .xml from 0 to 1. It looks like there is a bug in the model optimizer.

<layer id="313" name="up_sampling2d_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear" type="Interpolate" version="opset1">
    <data align_corners="0" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>
　↓
<layer id="313" name="up_sampling2d_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear" type="Interpolate" version="opset1">
    <data align_corners="1" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>

<layer id="356" name="up_sampling2d_1_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear" type="Interpolate" version="opset1">
    <data align_corners="0" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>
　↓
<layer id="356" name="up_sampling2d_1_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear" type="Interpolate" version="opset1">
    <data align_corners="1" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>

I may need to modify my script to change align_corners = True and apply OpenVINO specific transformations to match the behavior of the model optimizer.
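
The manual edit of the .xml described above could be automated along these lines. This is a minimal sketch using only Python's standard `xml.etree.ElementTree`; the helper name `set_align_corners` and the tiny `SAMPLE_IR` document are illustrative assumptions and are not part of tflite2tensorflow or the OpenVINO tooling.

```python
import xml.etree.ElementTree as ET


def set_align_corners(xml_text, value="1"):
    """Return (patched_xml, count) with align_corners set on every Interpolate layer."""
    root = ET.fromstring(xml_text)
    patched = 0
    for layer in root.iter("layer"):
        if layer.get("type") != "Interpolate":
            continue
        data = layer.find("data")
        # Only touch layers that already carry the attribute.
        if data is not None and "align_corners" in data.attrib:
            data.set("align_corners", value)
            patched += 1
    return ET.tostring(root, encoding="unicode"), patched


# Tiny stand-in for an IR file; a real .xml has many more layers and ports.
SAMPLE_IR = """<net name="demo" version="10">
  <layers>
    <layer id="313" name="resize" type="Interpolate" version="opset1">
      <data align_corners="0" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>
    </layer>
    <layer id="314" name="conv" type="Convolution" version="opset1">
      <data strides="1,1"/>
    </layer>
  </layers>
</net>"""

patched_xml, count = set_align_corners(SAMPLE_IR)
print(f"patched {count} Interpolate layer(s)")
```

For a real model, the same function could be applied to the text read from the generated .xml and the result written back. Only an attribute changes, so the weight offsets that point into the .bin file are left untouched.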

geaxgx commented 3 years ago

Great job Pinto ! Once again you impressed me ! I could never have found your fix. When I looked at the tflite model in Netron, align_corners was set to False, so it seemed consistent with align_corners being set to 0 in the IR model (?) I wonder if older versions of the model optimizer give the same result.

PINTO0309 commented 3 years ago

The ResizeBilinear of the tflite model (palm_detection.tflite) before conversion has align_corners=False. Screenshot 2021-02-19 23:46:54

Therefore, align_corners in the converted OpenVINO IR also becomes false, which at first glance is the correct behavior. It probably means that the runtime behaves differently depending on the align_corners setting: TensorFlow's align_corners and OpenVINO's align_corners do not appear to behave the same way.

This strange behavior may only occur when converting from tflite using my script, but I think you need to keep an eye on the settings of the Resize or Upsampling operations of the saved_model from which the conversion originated. The same thing may happen with older versions of the model optimizer.

geaxgx commented 3 years ago

I have just applied your fix and tested it. Actually, the output on my example image is not exactly the same as the tflite output, but it is better than the version without the fix. Do you think this is because the Interpolate implementation of OpenVINO is not a one-to-one equivalent of the ResizeBilinear function of tflite (for instance, I think there is no "half_pixel_centers" parameter for Interpolate) ?

geaxgx commented 3 years ago

I am currently looking at the OpenVINO documentation. Have you seen that there are 2 versions of Interpolate ? https://docs.openvinotoolkit.org/latest/openvino_docs_ops_image_Interpolate_1.html and https://docs.openvinotoolkit.org/latest/openvino_docs_ops_image_Interpolate_4.html It seems that Interpolate-4 has a coordinate_transformation_mode parameter related to half_pixel. But I have no idea how we can choose the version we want (?), or even whether it would help in my case.
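
To make the role of the coordinate mapping concrete, here is a minimal 1-D sketch of linear resizing under three commonly documented conventions: `asymmetric` (TensorFlow's ResizeBilinear with align_corners=False and half_pixel_centers=False), `align_corners`, and `half_pixel`. The function names are hypothetical, and the code only illustrates that the conventions produce different values; it makes no claim about which one the OpenVINO runtime applies to an opset1 Interpolate layer.

```python
import math


def source_coord(dst, n_in, n_out, mode):
    """Map an output index to a (fractional) input coordinate."""
    if mode == "asymmetric":
        return dst * n_in / n_out
    if mode == "align_corners":
        return dst * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
    if mode == "half_pixel":
        return (dst + 0.5) * n_in / n_out - 0.5
    raise ValueError(f"unknown mode: {mode}")


def resize_linear_1d(values, n_out, mode):
    """Linearly resample a 1-D list, clamping coordinates to the valid range."""
    n_in = len(values)
    out = []
    for dst in range(n_out):
        src = min(max(source_coord(dst, n_in, n_out, mode), 0.0), n_in - 1)
        lo = math.floor(src)
        hi = min(lo + 1, n_in - 1)
        frac = src - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# 4 samples upsampled to 8, mirroring the 4x4 -> 8x8 resize layer in the model.
row = [0.0, 10.0, 20.0, 30.0]
for mode in ("asymmetric", "align_corners", "half_pixel"):
    print(mode, [round(v, 2) for v in resize_linear_1d(row, 8, mode)])
```

On a feature map as small as 4x4, the three mappings differ at almost every output position, so a mismatch in convention between converter and runtime could plausibly shift the decoded boxes.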

PINTO0309 commented 3 years ago

Looking only at the behavioral results, maybe TensorFlow's half_pixel_centers and OpenVINO's align_corners are synonymous.

> It seems that Interpolate-4 has a coordinate_transformation_mode parameter related to half_pixel. But I have no idea on how we can choose the version we want (?) and even if it could help in my case.

You're right. If we could adjust the behavior of the model optimizer, the problem would not have occurred. I don't know what the conditions are to make the model optimizer select Interpolate-4.

I have not tried this, but you may be able to correct the behavior by rewriting version="opset1" to version="opset4" and adding the missing attributes by yourself.

<layer id="312" name="up_sampling2d_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear/Cast_112441_const" type="Const" version="opset1">
    <data element_type="i64" offset="2569092" shape="2" size="16"/>
    <output>
        <port id="1" precision="I64">
            <dim>2</dim>
        </port>
    </output>
</layer>
<layer id="313" name="up_sampling2d_lambda/resize/ResizeBilinear_lambda/resize/ResizeBilinear" type="Interpolate" version="opset1">
    <data align_corners="1" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>256</dim>
            <dim>4</dim>
            <dim>4</dim>
        </port>
        <port id="1">
            <dim>2</dim>
        </port>
    </input>
    <output>
        <port id="2" precision="FP32">
            <dim>1</dim>
            <dim>256</dim>
            <dim>8</dim>
            <dim>8</dim>
        </port>
    </output>
</layer>

geaxgx commented 3 years ago

Good idea ! I will try that.

PINTO0309 commented 3 years ago

I'm very tired and it's already midnight today, so I'll get some sleep first. I'll try it myself tomorrow if I can find the time.

geaxgx commented 3 years ago

Unfortunately, I quickly realized I could not get very far just by modifying the xml file as you proposed, because Interpolate-4 takes 4 inputs (vs 2 inputs for Interpolate-1). I guess it means the bin file would no longer be consistent with the xml file. Anyway, I did other qualitative tests that confirm your fix (align_corners="1") clearly gives better results than without the fix. I imagine we are both curious to know the reason for the different model behaviours but, knowing that you are very busy, I don't want to push you to spend too much time on trying to get exactly the same outputs as the tflite model. I am already satisfied with what I have. Palm detection is just a preliminary step for the landmark model. Similar palm bounding boxes will give very similar hand landmarks. Thank you !
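
Before attempting such an opset rewrite, it may help to list which Interpolate layers an IR contains and how many input ports each one has. The following is a minimal sketch using only the standard library; the function name `describe_interpolate_layers` and the trimmed `SAMPLE_IR` snippet are illustrative assumptions, not part of any OpenVINO tool.

```python
import xml.etree.ElementTree as ET


def describe_interpolate_layers(xml_text):
    """Return (id, version, input_port_count, attributes) for each Interpolate layer."""
    root = ET.fromstring(xml_text)
    report = []
    for layer in root.iter("layer"):
        if layer.get("type") != "Interpolate":
            continue
        inputs = layer.find("input")
        n_inputs = len(inputs.findall("port")) if inputs is not None else 0
        data = layer.find("data")
        attrs = dict(data.attrib) if data is not None else {}
        report.append((layer.get("id"), layer.get("version"), n_inputs, attrs))
    return report


# Trimmed stand-in modeled on layer 313 quoted earlier in the thread.
SAMPLE_IR = """<net name="demo" version="10">
  <layers>
    <layer id="313" name="resize" type="Interpolate" version="opset1">
      <data align_corners="1" antialias="0" axes="2,3" mode="linear" pads_begin="0" pads_end="0"/>
      <input>
        <port id="0"><dim>1</dim><dim>256</dim><dim>4</dim><dim>4</dim></port>
        <port id="1"><dim>2</dim></port>
      </input>
      <output>
        <port id="2" precision="FP32"><dim>1</dim><dim>256</dim><dim>8</dim><dim>8</dim></port>
      </output>
    </layer>
  </layers>
</net>"""

report = describe_interpolate_layers(SAMPLE_IR)
for layer_id, version, n_inputs, attrs in report:
    print(layer_id, version, f"{n_inputs} inputs", attrs.get("align_corners"))
```

A layer reported with 2 inputs under opset1 cannot simply be relabeled, since the extra inputs expected by Interpolate-4 would also have to be added to the graph.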

PINTO0309 commented 3 years ago

I will continue to work on improving the tool and will let you know as soon as I find a pattern where Interpolate-4 is applied.

PINTO0309 commented 3 years ago

Added optimizing_for_openvino_and_myriad option and upgraded to v1.3.4. Modified the script to generate OpenVINO IR with align_corners="1" (True) by converting with optimizing_for_openvino_and_myriad in Step.1.

geaxgx commented 3 years ago

Thank you Pinto ! I have just tried it on my CPU and on an OAK-D. It is working great ! (For the OAK-D, because I want to run inference on images coming from the on-board camera, I had to generate a blob with arguments different from the ones used in tflite2tensorflow.)

You made a great tool. Just wondering why you haven't named it "tflite2anything" ;-)

PINTO0309 commented 3 years ago

I'm very happy to hear that it went well.

> You made a great tool. Just wondering why you haven't named it "tflite2anything" ;-)

Hahaha. :smile: At first, as the name of the repository suggests, I was going to implement only the features I wanted, but the number of features I wanted increased too much. As I improved the ease of use of the tools, I gradually came to want to eliminate the hassle of using multiple tools. I will continue to make enhancements in response to your requests. The mismatch with the name will get bigger.

PINTO0309 commented 3 years ago

The problem appears to have been resolved, so I will close this issue for now. If you run into other problems, please open a new issue.

PINTO0309 / tflite2tensorflow