tensorflow / models

Models and examples built with TensorFlow

Conversion success, but detect.tflite returns weird results. #9413

Open OswinGuai opened 3 years ago

OswinGuai commented 3 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/object_detection

2. Describe the bug

I fine-tuned an SSD-MobileNet V2 model with TensorFlow 2.3 and converted it to TFLite with TF-Nightly. The conversion succeeds and I get a detect.tflite file of about 11 MB, but when I use this model in my Android project it outputs weird results, with extremely low confidences for all 10 detections. As a comparison, I used TensorFlow 1.15 and TF-Nightly to generate a TFLite model of SSD-MobileNet V1, and the results were good.

3. Steps to reproduce

See the description above.

4. Expected behavior

I expect a detailed, working flow for generating a TFLite model of SSD-MobileNet V2.

5. Additional context

The bad results of SSD-MobileNet V2 (screenshot).

The good results of SSD-MobileNet V1 (screenshot).

6. System information

huberemanuel commented 3 years ago

@OswinGuai are you sure this is a duplicate? Here MobileNet SSD v2 doesn't work well after conversion, while in #9287 the same model works well after conversion but the ResNet model doesn't. They looked like the same issue to me, but in this report it is MobileNet SSD v2 that fails. I am seeing the same problem with MobileNet SSD v2; I will run some tests with the TF1 version.

OswinGuai commented 3 years ago

@huberemanuel Sorry, my mistake, they are not duplicates. You are right. I tried several version combinations but only got errors or wrong predictions. Good luck with your tests.

srjoglekar246 commented 3 years ago

@OswinGuai Can you give some pointers to your inference code? It seems like there is some difference in what the TF1 model did, vs TF2.

huberemanuel commented 3 years ago

Guys, I finally got good results with the converted TFLite MobileNet SSD model. I had started testing with TF1, but moved back to TF2 after a newer tf-nightly fixed the bug that produced a TFLite model of less than 1 KB. Prediction time also improved: before, the original model was actually faster than the TFLite model, which didn't make sense (again, fixed by the new tf-nightly). As for qualitative results, the predictions were still poor at first: the original model was around 80% accurate while the TFLite model was not even 10% accurate. The solution was to normalize the input image as follows:

import numpy as np
import tensorflow as tf

# Load the image at the model's input resolution and add a batch dimension.
ori_img = tf.keras.preprocessing.image.load_img(image_path, target_size=input_shape)
input_data = np.array(np.expand_dims(ori_img, 0))

# Important line: scale [0, 255] pixels to [-1, 1] and cast to float32 for the interpreter.
input_data = (input_data / 128 - 1).astype(np.float32)

After that, my quantized TFLite model gives around 80% accuracy, just like the original model. Hopefully this helps you correct your issue, @OswinGuai.
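For completeness, a minimal sketch (the model path is a placeholder) of feeding that normalized input_data to the converted model with the TFLite interpreter; the outputs are read by name rather than by a hard-coded order, since TF1- and TF2-exported SSD models can order them differently:

import tensorflow as tf

# Load the converted model and prepare its tensors.
interpreter = tf.lite.Interpreter(model_path="detect.tflite")  # placeholder path
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed the [-1, 1] float32 image prepared above and run inference.
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Collect the detection outputs (boxes, classes, scores, count) by tensor name.
outputs = {d['name']: interpreter.get_tensor(d['index']) for d in output_details}
for name, value in outputs.items():
    print(name, value.shape)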

If it helps, this is my TFLite conversion procedure:

import tensorflow as tf

# Convert the TFLite-friendly SavedModel exported by the Object Detection API.
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir=saved_model_dir, signature_keys=['serving_default'])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.experimental_new_converter = True
converter.allow_custom_ops = True
tflite_model = converter.convert()
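As a small follow-up sketch (the file name is taken from the issue title), the converted bytes just get written to disk for the app to load:

# Write the serialized FlatBuffer to the file the app loads.
with open("detect.tflite", "wb") as f:
    f.write(tflite_model)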

huberemanuel commented 3 years ago

@srjoglekar246 is the input normalization a required operation? I could not find it in the TFLite example.

OswinGuai commented 3 years ago

@OswinGuai Can you give some pointers to your inference code? It seems like there is some difference in what the TF1 model did, vs TF2.

@srjoglekar246 Excuse me for the late response. Here is what I tried for inference with TensorFlow 2.x (screenshot).

By the way, there is another error if I do the inference like this (screenshot):

The first way runs, but the results are wrong. The error from the second way says the input tensor has type kTfLiteFloat32 (screenshot).
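The screenshots are not preserved here, but a kTfLiteFloat32 message usually means the array passed to set_tensor does not match the model's expected input type; one way to check what the converted model expects is a sketch like this (the model path is a placeholder):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="detect.tflite")  # placeholder path
interpreter.allocate_tensors()

# Shows the dtype (e.g. numpy.float32) and shape the model expects, so the input
# array can be cast and resized to match before calling set_tensor.
inp = interpreter.get_input_details()[0]
print(inp['dtype'], inp['shape'])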

OswinGuai commented 3 years ago

@huberemanuel It is cool. I will try it as soon as possible.

srjoglekar246 commented 3 years ago

@huberemanuel Yes, you need to do preprocessing as mentioned here. Could you check if there is a difference between this & your method?

huberemanuel commented 3 years ago

@srjoglekar246 The script you mentioned does the preprocessing, but not in the sense I was trying to describe, so let me rephrase. When I run inference on an image with the SavedModel format after training, I don't need to preprocess the input to get good results; I believe that preprocessing step is done inside the network. However, when I run the same inference with a TFLite model, I need to do the preprocessing by hand (or call a function that does it), because the model itself doesn't take care of this step, which leads to bad results if you don't normalize the image. I found in export_tflite_graph_tf2.py that this normalization is indeed required, as it states "image: a float32 tensor of shape [1, height, width, 3] containing the normalized input image." This step is not done in the colab example script, so I think that script should be updated with the normalization (I can help with this), and more importantly, the normalization should be highlighted in the documentation.
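To illustrate the distinction, here is a minimal sketch (the path, the 320x320 input size, and the random stand-in image are placeholders) of the SavedModel path, which accepts raw uint8 pixels and does its preprocessing inside the graph, unlike the TFLite graph, which expects the / 128 - 1 normalized float32 input shown earlier:

import numpy as np
import tensorflow as tf

# Standard TF2 Object Detection API SavedModel: preprocessing happens inside the graph.
detect_fn = tf.saved_model.load("exported_model/saved_model")  # placeholder path
image = np.random.randint(0, 256, size=(1, 320, 320, 3), dtype=np.uint8)  # stand-in image
detections = detect_fn(tf.constant(image))

# Outputs are already post-processed; no manual normalization was needed.
print(detections['detection_scores'][0, :5])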

srjoglekar246 commented 3 years ago

@huberemanuel Agreed, the Inception preprocessing is a detail many people tend to miss. I will probably get to the documentation in a while, but if you are interested, feel free to send a PR for it.

huberemanuel commented 3 years ago

Just correcting what I said earlier: the colab example does do the preprocessing, my mistake. I will make a PR highlighting this required step. Thank you @srjoglekar246.

srjoglekar246 commented 3 years ago

@OswinGuai Could you post your (untrained) TFLite model or pipeline config for me to take a look? And I assume you are using the SSD Android example (after modifying the app code parameters such as input size, etc., if required)?

judahkshitij commented 2 years ago

Hello, I am facing the same issue. Was there any update or resolution to it? I posted the comment below in another thread about wrong results from TFLite models.

Hello @lechwolowski, @srjoglekar246 and others, I have been trying to run inference after converting a few models from the TF2 object detection zoo to TFLite using the process described in this guide, but I am getting wrong results from the TFLite model (as a first step, I am trying the basic TFLite model without quantization or any other optimization).

The models I have tried are:

The inference code I have used is the same as posted by @lechwolowski (also tried a few variants of the inference code that I found in other threads in this repo, but nothing worked).

I see that @lechwolowski was able to get correct results from the above models except the ResNet-based ones. But for me, none of the above models gives correct results on COCO 2017 validation set images (even when the TFLite model is generated from the "TFLite-friendly" SavedModel). Can any of you provide insight into how you made it work, or what I could be doing wrong? Any help is greatly appreciated. Thanks.

UcefMountacer commented 2 years ago

@srjoglekar246 Thank you. Normalization improved the results a LOT. I just added this line: frame = frame / 255, which I had previously only used with my YOLO-tiny model.

Can you explain this line: input_data = input_data / 128 - 1?
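(For reference, that line is the Inception-style normalization discussed above: it shifts pixel values from [0, 255] to roughly [-1, 1], which is the range the exported TFLite detection graph expects. A quick check:)

import numpy as np

pixels = np.array([0, 128, 255], dtype=np.float32)
# Maps 0 -> -1.0, 128 -> 0.0, 255 -> ~0.992: roughly the [-1, 1] range.
print(pixels / 128 - 1)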