tensorflow / models

Models and examples built with TensorFlow

SSD ResNet from model zoo not working after conversion to TFLite #9287

Open lechwolowski opened 3 years ago

lechwolowski commented 3 years ago

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/object_detection/export_tflite_graph_tf2.py

2. Describe the bug

I tried to convert the following unchanged models from the model zoo to TFLite.

After conversion, the ResNet models returned meaningless predictions:

SSD MobileNet v2 320x320 - Working
SSD MobileNet V1 FPN 640x640 - Working
SSD MobileNet V2 FPNLite 320x320 - Working
SSD MobileNet V2 FPNLite 640x640 - Working
SSD ResNet50 V1 FPN 640x640 - Not Working
SSD ResNet50 V1 FPN 1024x1024 - Not Working

3. Steps to reproduce

python ~/models/research/object_detection/export_tflite_graph_tf2.py \
    --pipeline_config_path 'path/to/model/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/pipeline.config' \
    --trained_checkpoint_dir 'path/to/model/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint' \
    --output_directory '/output/path'
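
The exporter writes a TFLite-friendly SavedModel under the given output directory; convert that SavedModel with the Python API:
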
import tensorflow as tf
saved_model_dir = 'path/to/saved/model/saved_model'
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = True
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()
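
convert() only returns the flatbuffer in memory; a minimal sketch of writing it out (the file name is illustrative):

# Write the converted model to disk so it can be loaded by the interpreter
with open('ssd_resnet50_v1_fpn_640x640.tflite', 'wb') as f:
    f.write(tflite_model)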

Inference with the newly created TFLite model:

import tensorflow as tf
import matplotlib
import matplotlib.pyplot as plt

import cv2
import time
import numpy as np

from PIL import Image
print(tf.__version__)

MODEL_PATH = '/path/to/tflite/model/ssd_resnet50_v1_fpn_640x640.tflite'
def set_input_tensor(interpreter, image):
  """Sets the input tensor."""
  tensor_index = interpreter.get_input_details()[0]['index']
  input_tensor = interpreter.tensor(tensor_index)()[0]
  input_tensor[:, :] = image

def get_output_tensor(interpreter, index):
  """Returns the output tensor at the given index."""
  output_details = interpreter.get_output_details()[index]
  tensor = np.squeeze(interpreter.get_tensor(output_details['index']))
  return tensor

def detect_objects(interpreter, image, threshold):
  """Returns a list of detection results, each a dictionary of object info."""
  set_input_tensor(interpreter, image)
  interpreter.invoke()

  # Get all output details
  boxes = get_output_tensor(interpreter, 0)
  classes = get_output_tensor(interpreter, 1)
  scores = get_output_tensor(interpreter, 2)
  count = int(get_output_tensor(interpreter, 3))

  results = []
  for i in range(count):
    if scores[i] >= threshold:
      result = {
          'bounding_box': boxes[i],
          'class_id': classes[i],
          'score': scores[i]
      }
      results.append(result)
  return results
interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
_, HEIGHT, WIDTH, _ = interpreter.get_input_details()[0]['shape']
print(f"Height and width accepted by the model: {HEIGHT, WIDTH}")
def preprocess_image(image_path):
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    original_image = img
    resized_img = tf.image.resize(img, (HEIGHT, WIDTH))
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image
LABEL_DICT = {
1: "person",
2: "bicycle",
3: "car",
4: "motorcycle",
5: "airplane",
6: "bus",
7: "train",
8: "truck",
9: "boat",
10: "traffic light",
11: "fire hydrant",
13: "stop sign",
14: "parking meter",
15: "bench",
16: "bird",
17: "cat",
18: "dog",
19: "horse",
20: "sheep",
21: "cow",
22: "elephant",
23: "bear",
24: "zebra",
25: "giraffe",
27: "backpack",
28: "umbrella",
31: "handbag",
32: "tie",
33: "suitcase",
34: "frisbee",
35: "skis",
36: "snowboard",
37: "sports ball",
38: "kite",
39: "baseball bat",
40: "baseball glove",
41: "skateboard",
42: "surfboard",
43: "tennis racket",
44: "bottle",
46: "wine glass",
47: "cup",
48: "fork",
49: "knife",
50: "spoon",
51: "bowl",
52: "banana",
53: "apple",
54: "sandwich",
55: "orange",
56: "broccoli",
57: "carrot",
58: "hot dog",
59: "pizza",
60: "donut",
61: "cake",
62: "chair",
63: "couch",
64: "potted plant",
65: "bed",
67: "dining table",
70: "toilet",
72: "tv",
73: "laptop",
74: "mouse",
75: "remote",
76: "keyboard",
77: "cell phone",
78: "microwave",
79: "oven",
80: "toaster",
81: "sink",
82: "refrigerator",
84: "book",
85: "clock",
86: "vase",
87: "scissors",
88: "teddy bear",
89: "hair drier",
90: "toothbrush",
91: "__background__"
}

# One random color per possible COCO class id (ids run up to 91)
COLORS = np.random.randint(0, 255, size=(92, 3),
                            dtype="uint8")
def display_results(image_path, threshold=0.3):
    # Load the input image and preprocess it
    preprocessed_image, original_image = preprocess_image(image_path)
    # print(preprocessed_image.shape, original_image.shape)

    # =============Perform inference=====================
    start_time = time.monotonic()
    results = detect_objects(interpreter, preprocessed_image, threshold=threshold)
    print(f"Elapsed time: {(time.monotonic() - start_time)*1000} miliseconds")

    # =============Display the results====================
    original_numpy = original_image.numpy()
    for obj in results:
        # Convert the bounding box figures from relative coordinates
        # to absolute coordinates based on the original resolution
        ymin, xmin, ymax, xmax = obj['bounding_box']
        xmin = int(xmin * original_numpy.shape[1])
        xmax = int(xmax * original_numpy.shape[1])
        ymin = int(ymin * original_numpy.shape[0])
        ymax = int(ymax * original_numpy.shape[0])

        # Grab the class id for the current detection; the model's ids are
        # 0-based while LABEL_DICT is 1-based
        idx = int(obj['class_id']) + 1
        # Skip the background and any id that has no label
        if idx not in LABEL_DICT or LABEL_DICT[idx] == "__background__":
            continue

        # draw the bounding box and label on the image
        color = [int(c) for c in COLORS[idx]]
        cv2.rectangle(original_numpy, (xmin, ymin), (xmax, ymax), 
                    color, 2)
        y = ymin - 15 if ymin - 15 > 15 else ymin + 15
        label = "{}: {:.2f}%".format(LABEL_DICT[int(obj['class_id']) + 1],
            obj['score'] * 100)
        cv2.putText(original_numpy, label, (xmin, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    # Return the final image as uint8 for display
    original_int = (original_numpy * 255).astype(np.uint8)
    return original_int
resultant_image = display_results("/path/to/example/image/apple.jpg")
Image.fromarray(resultant_image)
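
With the MobileNet variants listed above, this pipeline draws sensible detections; with the SSD ResNet50 FPN models, the boxes and scores it returns are meaningless.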

4. Expected behavior

I expected SSD ResNet50 to work after converting to TFLite.

5. Additional context

There are some warnings while running export_tflite_graph_tf2.py.

6. System information

k-lyda commented 3 years ago

I have a similar issue with an SSD with a ResNet backbone in TF 2.0. Even models from the model zoo, when converted to TFLite, run but return random results.

AMArostegui commented 3 years ago

I believe I'm in the same situation.

Using an image dataset I create:

1) A TF1 Object Detection model based on SSD Mobilenet v2 Quantized 300x300. Results are OK.
2) A TFLite model based on 1). Results are OK; very similar to 1).
3) A TF2 Object Detection model based on SSD Resnet50. Results are better than 1).
4) A TFLite model based on 3), using the recently published script export_tflite_graph_tf2 and tf-nightly==2.4.0.dev20200924. Results seem random.

Some remarkable things about the model created in step 4)

yoni commented 3 years ago

Seeing the same issue. Thanks so much for reporting this, Lech.

OswinGuai commented 3 years ago

I'm running into the same issue. Still looking for a solution.

srjoglekar246 commented 3 years ago

Might be some issue with the export/conversion process. Will take a look.

yoni commented 3 years ago

FWIW this appears to not be a bug - just gratuitous INFO logging.

srjoglekar246 commented 3 years ago

@yoni The issue originally states that the model produces wrong results, was that the behavior you observed with the SSD ResNet model?

lwbhahahaha commented 3 years ago

Same issue here.

srjoglekar246 commented 3 years ago

@lwbhahahaha What's the issue in your case? Are you using the same inference code as above?

kirienkomaxym commented 3 years ago

Facing the same problem using TensorFlow 2 and the Object Detection API.

When I train any of the MobileNets (e.g. V1, V2, different image sizes and so on) I get satisfying results. Everything works OK and converts to .tflite successfully.

But when I train any of the SSD ResNets on the same dataset (I tried ssd_resnet50, ssd_resnet101 and ssd_resnet152) I get zero results at all, no matter what I run: inference graphs, .tflite models, or the model loaded from the checkpoint. I get very low detection scores (less than 10e-2) and the results seem random.

I've tried decreasing the learning rate and changing hyperparameters, but the results are still zero, while the training loss falls from about 3 to 0.5.

FredrikHolsten commented 3 years ago

Same issue here.

srjoglekar246 commented 3 years ago

I am able to reproduce this on my end. Sorry about the accuracy drop :-(.

no matter what I run: inference graphs, .tflite models, or the model loaded from the checkpoint

@kirienkomaxym from your comment it looks like the ResNet SSD model doesn't work in TF either (i.e., before conversion to TFLite). Is that correct? If so, then the bug might be with the model itself.
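
One way to isolate this, as a hedged sketch: run the exported SavedModel directly in TF before any TFLite conversion and see whether its outputs already look wrong (paths and input shape are illustrative, taken from the 640x640 config):

import tensorflow as tf

# Load the TFLite-friendly SavedModel produced by export_tflite_graph_tf2.py
model = tf.saved_model.load('path/to/saved/model/saved_model')
infer = model.signatures['serving_default']

# Look up the signature's input name instead of hard-coding it
input_name = list(infer.structured_input_signature[1].keys())[0]

# Dummy float input just to confirm the graph runs and to inspect shapes;
# feed a real image to judge prediction quality
dummy = tf.zeros([1, 640, 640, 3], dtype=tf.float32)
outputs = infer(**{input_name: dummy})
print({name: tensor.shape for name, tensor in outputs.items()})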

kirienkomaxym commented 3 years ago

@srjoglekar246 yes, it is correct, it does not work in TF.

I have tried both configs: the one that can be downloaded from the TensorFlow configs directory (for example http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.config) and the one stored in the pre-trained checkpoint archive: http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz

I rewrote each config for my setup and got the same zero results.

Faiz-hmed commented 3 years ago

But when I train any of the SSD ResNets on the same dataset (I tried ssd_resnet50, ssd_resnet101 and ssd_resnet152) I get zero results at all, no matter what I run: inference graphs, .tflite models, or the model loaded from the checkpoint. I get very low detection scores (less than 10e-2) and the results seem random.

Well, with ssd_resnet101_v1_fpn_640x640, I seem to be getting accurate results from a TFLite model converted with the command-line tool. (It works with dynamic quantization via the Python API as well, but it takes a very long time to load the model and run inference.) A notebook doing all of that is linked below, if you want to check it out.
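
For reference, a minimal sketch of that command-line conversion (paths illustrative), assuming the TFLite-friendly SavedModel from export_tflite_graph_tf2.py:

tflite_convert \
  --saved_model_dir=/path/to/saved_model \
  --output_file=model.tflite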

[image: example detection results from the converted model]

Some code to plot detections (almost the same as the OP's) @lechwolowski:

from google.colab import drive

import cv2
import numpy as np
import tensorflow as tf

# Mount Google Drive so the test image below is reachable
drive.mount('/content/gdrive')

interpreter = tf.lite.Interpreter(model_path=r"model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2] 

image = cv2.imread("/content/gdrive/MyDrive/obj_det/190813-dairy-crisis-origin-of-livestock-rule-dairy-farm-1-1280x720.jpg")
imH, imW, _ = image.shape
print("{}, {} are height & width of the image before model pass".format(imH, imW))

image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
input_data = cv2.resize(image_rgb,(height, width))
print("{}, {} are height & width of the image before model pass after resizing".format(*input_data.shape[:-1]))

input_data = np.expand_dims(input_data, axis=0)
input_data = np.asarray(input_data, dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

boxes = interpreter.get_tensor(output_details[0]['index'])[0]
classes = interpreter.get_tensor(output_details[1]['index'])[0]
scores = interpreter.get_tensor(output_details[2]['index'])[0]

for i in range(len(scores)):
        # scores[i] <= 1.0 keeps every detection; raise this threshold to
        # filter out low-confidence boxes
        if (scores[i] <= 1.0):

            # Get bounding box coordinates and draw box
            # Interpreter can return coordinates that are outside of image dimensions, need to force them to be within image using max() and min()
            ymin = int(max(1,(boxes[i][0] * imH)))
            xmin = int(max(1,(boxes[i][1] * imW)))
            ymax = int(min(imH,(boxes[i][2] * imH)))
            xmax = int(min(imW,(boxes[i][3] * imW)))

            cv2.rectangle(image, (xmin,ymin), (xmax,ymax), (10, 255, 0), 2)

            # Draw label
            object_name = labels[int(classes[i])] # Look up the object name in the "labels" list (defined elsewhere in the notebook) using the class index
            label = '%s: %d%%' % (object_name, int(scores[i]*100)) # Example: 'person: 72%'
            labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2) # Get font size
            label_ymin = max(ymin, labelSize[1] + 10) # Make sure not to draw label too close to top of window
            cv2.rectangle(image, (xmin, label_ymin-labelSize[1]-10), (xmin+labelSize[0], label_ymin+baseLine-10), (255, 255, 255), cv2.FILLED) # Draw white box to put label text in
            cv2.putText(image, label, (xmin, label_ymin-7), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 0), 2) # Draw label text

from google.colab.patches import cv2_imshow
cv2_imshow(image) 

Also i'm putting a link to the colab notebook i used to run this: https://colab.research.google.com/drive/13Nlhd05781AGLStWHz1HHbj1JX1JR7ie?usp=sharing

Feel free to check it out, and I hope this helps everyone. @kirienkomaxym

One limitation is that it doesn't seem to quantize to an integer model. [Don't run the integer-quantization code in the notebook.]

Also, I'm trying to do the same inference with ssd_mobilenet_v1_fpn_640x640. That doesn't seem to be working: it returns random results and detects cars as bananas. Hopefully someone can get it to work. Thanks.

aescart1 commented 3 years ago

Tried recently; I was not able to make it work either, with any ResNet backbone for the SSD models.

judahkshitij commented 2 years ago

Hello @lechwolowski, @srjoglekar246 and others, I have been trying to run inference after converting a few models from the TF2 object detection zoo to TFLite using the process described in this guide, but I am getting wrong results from the TFLite model (as a first step, I am trying a basic TFLite model without quantization or any other optimization).

The models I have tried are:

The inference code I used is the same as the one posted by @lechwolowski (I also tried a few variants of the inference code that I found in other threads in this repo, but nothing worked).

I see that @lechwolowski was able to get correct results from the above models except the ResNet ones. But for me, none of the above models give correct results on COCO 2017 validation set images (even though the TFLite model is generated from the "TFLite-friendly" SavedModel). Can any of you provide insight into how you made it work or what I might be doing wrong? Any help is greatly appreciated. Thanks.

zahir2000 commented 2 years ago

I can confirm I have the same issue. All the other SSD models show results, but the ResNet ones do not produce any results when integrated into an app.

Natriumpikant commented 1 year ago

Same issue for me after converting to TFLite.

Does anyone have any news on this issue?

JuanJoseMoralesC commented 1 year ago

SSD ResNet still not working...

harufumigithub commented 1 year ago

SSD ResNet still not working...