tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

RuntimeError when invoking TFLite INT8 model with tile operation #68474

Closed ShabbirMarfatiya closed 3 months ago

ShabbirMarfatiya commented 3 months ago

I'm facing an issue while trying to run inference with a TensorFlow Lite model quantized to INT8 precision. The model is a CenterNet MobileNet trained for hand keypoint detection, and invoking the interpreter raises a RuntimeError with the following message:

```
Cell In[6], line 34
     31 # Note that CenterNet doesn't require any pre-processing except resizing to the
     32 # input size that the TensorFlow Lite Interpreter was generated with.
     33 input_tensor = tf.image.resize(input_tensor, (224, 224))
---> 34 (boxes, classes, scores, num_detections, kpts, kpts_scores) = detect(interpreter, input_tensor, include_keypoint=True)
     35 print("kpts:", scores[0])
     36 print("kpts_scores:", kpts[0][0]*image_numpy.shape[1])

Cell In[5], line 40
     38 print(input_tensor.dtype)
     39 interpreter.set_tensor(input_details[0]['index'], input_tensor)
---> 40 interpreter.invoke()
     42 scores = interpreter.get_tensor(output_details[3]['index'])
     43 boxes = interpreter.get_tensor(output_details[2]['index'])

File ~/.local/lib/python3.8/site-packages/tensorflow/lite/python/interpreter.py:923, in Interpreter.invoke(self)
    911 """Invoke the interpreter.
    912 
    913 Be sure to set the input sizes, allocate tensors and fill values before
   (...)
    920   ValueError: When the underlying interpreter fails raise ValueError.
    921 """
    922 self._ensure_safe()
--> 923 self._interpreter.Invoke()

RuntimeError: Type 'INT8' is not supported by tile. Node number 198 (TILE) failed to invoke.
```
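For reference, the failing node can be cross-checked against the model's op list. A minimal sketch, assuming a TF build recent enough to ship `tf.lite.experimental.Analyzer` (TF 2.7+); the model filename is the one produced by the conversion script below:

```python
import tensorflow as tf

# Dump the op-by-op structure of the converted model; node 198 should show up
# as a TILE op according to the error above.
tf.lite.experimental.Analyzer.analyze(model_path='centernet_int8_03.tflite')
```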

Environment:

Steps to Reproduce:

  1. Train a CenterNet MobileNet model for hand keypoint detection
  2. Convert the trained model to TensorFlow Lite INT8 precision using the following code:
    
```python
import tensorflow as tf

def parse_tfrecord_fn(example):
    feature_description = {
        'image/encoded': tf.io.FixedLenFeature([], tf.string),
        # Add other features here if necessary.
    }
    example = tf.io.parse_single_example(example, feature_description)
    image = tf.io.decode_jpeg(example['image/encoded'], channels=3)
    image = tf.image.resize(image, [224, 224])  # Adjust size as necessary.
    image = tf.cast(image, tf.float32) / 255.0  # Normalize to [0, 1] if required.
    return image

def load_tfrecord_dataset(tfrecord_path, batch_size=1):
    raw_dataset = tf.data.TFRecordDataset(tfrecord_path)
    dataset = raw_dataset.map(parse_tfrecord_fn)
    dataset = dataset.batch(batch_size)
    return dataset

def representative_dataset(tfrecord_path, num_samples):
    dataset = load_tfrecord_dataset(tfrecord_path)
    for data in dataset.take(num_samples):
        yield [data]

# Load the TensorFlow SavedModel.
saved_model_dir = 'model_weights/tflite/saved_model'
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)

# Set optimization to default for INT8 conversion.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Set the representative dataset.
tfrecord_path = '/home/ai_server/Shabbir/Hand_Keypoint_Detection/data/coco_testdev.record-00001-of-00050'
num_samples = 100  # Adjust the number of samples as needed.
converter.representative_dataset = lambda: representative_dataset(tfrecord_path, num_samples)

# Ensure that input and output tensors are quantized.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.allow_custom_ops = True
converter.inference_input_type = tf.uint8  # or tf.int8
converter.inference_output_type = tf.uint8  # or tf.int8

# Convert the model.
tflite_quant_model = converter.convert()

# Save the quantized model.
with open('centernet_int8_03.tflite', 'wb') as f:
    f.write(tflite_quant_model)
```


  3. Run inference with the converted INT8 model using the following code:

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from io import BytesIO
from PIL import Image

from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder

%matplotlib inline

# Print the image we are going to test on as a sanity check.
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph. Note that by
    convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: a file path.

    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

from tensorflow.python.ops.numpy_ops import np_config
np_config.enable_numpy_behavior()

def detect(interpreter, input_tensor, include_keypoint=False):
    """Run detection on an input image.

    Args:
        interpreter: tf.lite.Interpreter
        input_tensor: A [1, height, width, 3] Tensor of type tf.float32. Note
            that height and width can be anything since the image will be
            immediately resized according to the needs of the model within
            this function.
        include_keypoint: True if model supports keypoints output. See
            https://cocodataset.org/#keypoints-2020

    Returns:
        A sequence containing the following output tensors:
            boxes: a numpy array of shape [N, 4]
            classes: a numpy array of shape [N]. Note that class indices are
                1-based, and match the keys in the label map.
            scores: a numpy array of shape [N] or None. If scores=None, then
                this function assumes that the boxes to be plotted are
                groundtruth boxes and plot all boxes as black with no classes
                or scores.
        If include_keypoint is True, the following are also returned:
            keypoints: (optional) a numpy array of shape [N, 17, 2]
                representing the yx-coordinates of the detection 17 COCO human
                keypoints (https://cocodataset.org/#keypoints-2020) in
                normalized image frame (i.e. [0.0, 1.0]).
            keypoint_scores: (optional) a numpy array of shape [N, 17]
                representing the keypoint prediction confidence scores.
    """
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    input_tensor = (input_tensor * 255).astype(np.uint8)
    print(input_tensor.dtype)
    interpreter.set_tensor(input_details[0]['index'], input_tensor)
    interpreter.invoke()

    scores = interpreter.get_tensor(output_details[3]['index'])
    boxes = interpreter.get_tensor(output_details[2]['index'])
    num_detections = interpreter.get_tensor(output_details[5]['index'])
    classes = interpreter.get_tensor(output_details[0]['index'])

    if include_keypoint:
        kpts_scores = interpreter.get_tensor(output_details[4]['index'])
        kpts = interpreter.get_tensor(output_details[1]['index'])
        return boxes, classes, scores, num_detections, kpts, kpts_scores
    else:
        return boxes, classes, scores, num_detections

# Utility for visualizing results.
def plot_detections(image_np,
                    boxes,
                    classes,
                    scores,
                    category_index,
                    keypoints=None,
                    keypoint_scores=None,
                    figsize=(12, 16),
                    image_name=None):
    """Wrapper function to visualize detections.

    Args:
        image_np: uint8 numpy array with shape (img_height, img_width, 3)
        boxes: a numpy array of shape [N, 4]
        classes: a numpy array of shape [N]. Note that class indices are
            1-based, and match the keys in the label map.
        scores: a numpy array of shape [N] or None. If scores=None, then this
            function assumes that the boxes to be plotted are groundtruth
            boxes and plot all boxes as black with no classes or scores.
        category_index: a dict containing category dictionaries (each holding
            category index `id` and category name `name`) keyed by category
            indices.
        keypoints: (optional) a numpy array of shape [N, 17, 2] representing
            the yx-coordinates of the detection 17 COCO human keypoints
            (https://cocodataset.org/#keypoints-2020) in normalized image
            frame (i.e. [0.0, 1.0]).
        keypoint_scores: (optional) a numpy array of shape [N, 17]
            representing the keypoint prediction confidence scores.
        figsize: size for the figure.
        image_name: a name for the image file.
    """
    keypoint_edges = [(0, 1), (1, 2), (0, 3), (3, 4), (0, 5), (5, 6), (0, 7),
                      (7, 8), (0, 9), (9, 10)]
    image_np_with_annotations = image_np.copy()
    # Only visualize objects that get a score > 0.2.
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_annotations,
        boxes,
        classes,
        scores,
        category_index,
        keypoints=keypoints,
        keypoint_scores=keypoint_scores,
        keypoint_edges=keypoint_edges,
        use_normalized_coordinates=True,
        min_score_thresh=0.2)
    if image_name:
        plt.imsave(image_name, image_np_with_annotations)
    else:
        return image_np_with_annotations

# Load the TFLite model and allocate tensors.
# model_path = "workspace/tflite/model_6_May.tflite"
model_path = "/home/ai_server/Shabbir/DISIGN/model_weights/centernet_int8_03.tflite"
label_map_path = '/home/ai_server/Shabbir/DISIGN/model_weights/label_map.pbtxt'
image_path = '/home/ai_server/Shabbir/Hand_Keypoint_Detection/test/806.jpg'
# image_path = '/home/shabbirmarfatiya/Shabbir/Project/ML_Tasks/Hand_Gesture_Recognition/Hand_Keypoint_Detection/FreiHAND_Dataset/test/' + str(dir_list[15])

# Initialize the TensorFlow Lite Interpreter.
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()

# Label map can be used to figure out what class ID maps to what label.
# label_map.txt is human-readable.
# category_index = {1: {'id': 1, 'name': 'person'}}
category_index = label_map_util.create_category_index_from_labelmap(label_map_path)
print(category_index)

label_id_offset = 1

image = tf.io.read_file(image_path)
image = tf.compat.v1.image.decode_jpeg(image)
image = tf.expand_dims(image, axis=0)
image_numpy = image.numpy()
print(image_numpy.shape)

input_tensor = tf.convert_to_tensor(image_numpy, dtype=tf.uint8)
# Note that CenterNet doesn't require any pre-processing except resizing to
# the input size that the TensorFlow Lite Interpreter was generated with.
input_tensor = tf.image.resize(input_tensor, (224, 224))
(boxes, classes, scores, num_detections, kpts, kpts_scores) = detect(
    interpreter, input_tensor, include_keypoint=True)
print("kpts:", scores[0])
print("kpts_scores:", kpts[0][0] * image_numpy.shape[1])
print("Boxes:", boxes)
print("classes:", classes)
print("num_detections:", num_detections)
# print("kpts_scores:", kpts[0][0])

vis_image = plot_detections(
    image_numpy[0],
    boxes[0],
    classes[0].astype(np.uint32) + label_id_offset,
    scores[0],
    category_index,
    keypoints=kpts[0],
    keypoint_scores=kpts_scores[0])

plt.figure(figsize=(15, 10))
plt.imshow(vis_image)
```
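For reference, this is how I sanity-check the converted model's tensor types after the conversion in step 2. A minimal sketch; it only prints the interpreter's I/O details:

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(
    model_path='/home/ai_server/Shabbir/DISIGN/model_weights/centernet_int8_03.tflite')
interpreter.allocate_tensors()

# Input/output dtypes should come back as uint8 (with quantization params)
# if the conversion honored inference_input_type / inference_output_type.
for detail in interpreter.get_input_details():
    print('input:', detail['name'], detail['dtype'], detail['quantization'])
for detail in interpreter.get_output_details():
    print('output:', detail['name'], detail['dtype'], detail['quantization'])
```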



Expected Behavior:
The TensorFlow Lite INT8 model should run inference successfully without any errors.

Actual Behavior:
The RuntimeError is raised when invoking the interpreter, indicating that the tile operation is not supported in INT8 precision.

Additional Information:
I've followed the recommended steps for INT8 quantization, including setting the optimization flags, providing a representative dataset, and setting the target specs for INT8 operations. However, the issue persists.
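One workaround I am considering, in case TILE simply has no INT8 kernel in my TensorFlow version, is to let the converter fall back to float kernels for ops it cannot quantize instead of forcing pure INT8. A minimal sketch of the changed converter flags (the rest of the conversion script stays as in step 2 above):

```python
# Allow float fallback for ops (such as TILE) that lack INT8 kernels.
# Quantizable ops still get INT8 kernels; the model is no longer pure INT8.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,  # quantized kernels where available
    tf.lite.OpsSet.TFLITE_BUILTINS,       # float fallback for everything else
]
```

I have not yet verified whether `inference_input_type`/`inference_output_type` can stay `tf.uint8` in that configuration, or whether the accuracy/latency trade-off is acceptable for my target.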

@srjoglekar246 @PINTO0309 @NobuoTsukamoto @aeozyalcin @mpa74, Could you please assist me in resolving this issue? I would greatly appreciate any guidance or suggestions.
laxmareddyp commented 3 months ago

Hi @ShabbirMarfatiya ,

Thanks for filing an issue. Could you please file a bug in the TensorFlow repository, where you can get a resolution faster? This is not related to the Model Garden models.

Thanks.

ShabbirMarfatiya commented 3 months ago

Hi @laxmareddyp,

Thanks for the reply. I have filed a bug at the TensorFlow repo but haven't received a resolution yet.

sawantkumar commented 3 months ago

Hi @ShabbirMarfatiya ,

There is already an open issue with the same bug, so I am closing this one as a duplicate. Please track that issue for reference.

google-ml-butler[bot] commented 3 months ago

Are you satisfied with the resolution of your issue?