ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How can I adjust the output of my TFLite exported model in order to make it work with the official TFLite android app #8874

Closed mikel-brostrom closed 2 years ago

mikel-brostrom commented 2 years ago

Search before asking

Question

I want to load my TFLite exported Yolov5s model into the official TFLite object detection android app (https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android). The TFLite Yolov5 model outputs an array of shape [1, 25200, 17].

However, all the models in this app (MobileNet SSD, EfficientDet Lite 0, EfficientDet Lite 1, EfficientDet Lite 2) have 4 outputs: detection_boxes, detection_classes, detection_scores, num_detections, according to https://www.tensorflow.org/lite/examples/object_detection/overview#output_signature.
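
For context, the single YOLOv5 output tensor packs everything into the last dimension: 4 box coordinates (xywh), 1 objectness score, and one score per class (17 = 4 + 1 + 12, so presumably a 12-class custom model here). A minimal NumPy sketch of slicing that layout, using dummy data just to show the shapes:

```python
import numpy as np

# Dummy YOLOv5-style output: 1 image, 25200 candidate boxes,
# 17 = 4 (xywh) + 1 (objectness) + 12 (per-class scores)
pred = np.zeros((1, 25200, 17), dtype=np.float32)

xywh = pred[..., :4]        # box centers and sizes, shape (1, 25200, 4)
obj = pred[..., 4:5]        # objectness confidence,  shape (1, 25200, 1)
cls_scores = pred[..., 5:]  # per-class scores,       shape (1, 25200, 12)

print(xywh.shape, obj.shape, cls_scores.shape)
```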

How should I modify:

https://github.com/ultralytics/yolov5/blob/731a2f8c1ff060bda5e84e34c7cbdd637cfe4d75/models/tf.py#L421

in order to make my Yolov5 model loadable in this app?

Additional

When loading my TFLite Yolov5 model I get:

    error getting native address of native library: task_vision_jni
    java.lang.IllegalArgumentException: Error occurred when initializing ObjectDetector: Mobile SSD models are expected to have exactly 4 outputs, found 1

    2022-07-28 13:57:52.128 22356-22483/org.tensorflow.lite.examples.objectdetection E/Test: TFLite failed to load model with error: Error getting native address of native library: task_vision_jni

which clearly points to the output format as the issue.

mikel-brostrom commented 2 years ago

I got past this error with custom modifications to models/tf.py:

        x = x[0][0]  # [x(1,6300,85), ...] to x(6300,85)
        xywh = x[..., :4]  # x(6300,4) boxes
        conf = x[..., 4:5]  # x(6300,1) confidences
        cls = tf.reshape(tf.cast(tf.argmax(x[..., 5:], axis=1), tf.float32), (-1, 1))  # x(6300,1)  classes

        xywh = tf.expand_dims(xywh, 0)  # x(1,6300,4)
        conf = K.permute_dimensions(conf, (1, 0))  # K is tf.keras.backend; x(1,6300)
        cls = K.permute_dimensions(cls, (1, 0))
        cls = tf.expand_dims(cls, 0)  # x(1,1,6300)
        return xywh, cls, conf, cls
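
If I have traced the snippet above correctly, the shapes work out as follows (a NumPy re-implementation, assuming the 6300-box, 80-class head from the comments); note the final expand_dims leaves the classes tensor with an extra leading dimension, which could itself be a source of mismatch:

```python
import numpy as np

x = np.random.rand(1, 6300, 85).astype(np.float32)  # [xywh, obj, 80 class scores]
x0 = x[0]                                           # (6300, 85)
xywh = x0[..., :4]                                  # (6300, 4) boxes
conf = x0[..., 4:5]                                 # (6300, 1) confidences
cls = np.argmax(x0[..., 5:], axis=1).astype(np.float32).reshape(-1, 1)  # (6300, 1)

boxes = xywh[None, ...]     # tf.expand_dims(xywh, 0)      -> (1, 6300, 4)
scores = conf.T             # permute_dimensions((1, 0))   -> (1, 6300)
classes = cls.T[None, ...]  # permute + expand_dims        -> (1, 1, 6300)

print(boxes.shape, scores.shape, classes.shape)
```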

Now I get:

    java.lang.IllegalArgumentException: Error occurred when initializing ObjectDetector: Expected BoundingBoxProperties for tensor number of detections, found FeatureProperties.
        at org.tensorflow.lite.task.vision.detector.ObjectDetector.initJniWithModelFdAndOptions(Native Method)
        at org.tensorflow.lite.task.vision.detector.ObjectDetector.access$000(ObjectDetector.java:88)
        at org.tensorflow.lite.task.vision.detector.ObjectDetector$1.createHandle(ObjectDetector.java:156)
        at org.tensorflow.lite.task.vision.detector.ObjectDetector$1.createHandle(ObjectDetector.java:149)
        at org.tensorflow.lite.task.core.TaskJniUtils$1.createHandle(TaskJniUtils.java:70)

Any ideas? It seems like a metadata issue

glenn-jocher commented 2 years ago

@mikel-brostrom I'm not sure. In our past conversations with Google they indicated their app was only meant for Google models unfortunately.

mikel-brostrom commented 2 years ago

Ok, I guess that is why I cannot find any helpful sources related to my issue. It surprises me, though, that they keep the TFLite app limited to their own models. Thanks for your time again @glenn-jocher! Keep up the good work!

dvigouro commented 2 years ago

Hi. Thanks to @mikel-brostrom's hints, I managed to integrate a custom YOLOv5s model into the TensorFlow Lite Object Detection Android Demo. My modifications to tf.py are:

        xyxy = self._xywh2xyxy(x[0][..., :4])[0]
        x = x[0][0]  # [x(1,6300,85), ...] to x(6300,85)
        conf = x[..., 4:5]  # x(6300,1) confidences
        cls = tf.reshape(tf.cast(tf.argmax(x[..., 5:], axis=1), tf.float32), (-1, 1))  # x(6300,1)  classes
        xyxy = tf.expand_dims(
            xyxy, 0
        )
        conf = keras.backend.permute_dimensions(conf, (1, 0))
        cls = keras.backend.permute_dimensions(cls, (1, 0))
        nb_detection = tf.math.count_nonzero(conf, axis=1, dtype=tf.float32)
        return xyxy, cls, conf, nb_detection

The fourth returned parameter should be the number of detections, not the classes (this probably explains your error: "for tensor number of detections"). According to the ObjectDetector class definition, the bounding box type should be BOUNDARIES (xyxy: left, top, right, bottom) and not CENTER (xywh: center_x, center_y, width, height); see the metadata schema.
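
A center-format to corner-format conversion like `_xywh2xyxy` can be sketched as follows (a generic NumPy version for illustration, not the exact helper from tf.py):

```python
import numpy as np

def xywh2xyxy(boxes):
    """Convert [cx, cy, w, h] boxes to [x1, y1, x2, y2] corner format."""
    boxes = np.asarray(boxes, dtype=np.float32)
    out = np.empty_like(boxes)
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2  # x1 = cx - w/2
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2  # y1 = cy - h/2
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2  # x2 = cx + w/2
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2  # y2 = cy + h/2
    return out

print(xywh2xyxy([0.5, 0.5, 0.2, 0.4]))  # approximately [0.4, 0.3, 0.6, 0.7]
```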

The tflite file metadata should be adapted accordingly (output_location_meta.content.contentProperties.index = [0, 1, 2, 3]). Here is my script to populate the metadata:

# Create Metadata for Yolov5
# REF: 
# https://github.com/ultralytics/yolov5/issues/5784
# https://github.com/ultralytics/yolov5/issues/4760
# https://github.com/tensorflow/tflite-support/issues/703
# https://stackoverflow.com/questions/64097085/issue-in-creating-tflite-model-populated-with-metadata-for-object-detection
# https://stackoverflow.com/questions/70947940/output-tensor-from-tflite-interpreter-is-squeezed
# https://github.com/ultralytics/yolov5/issues/8874

import os

from tflite_support import flatbuffers
from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb

# Creates model info.
model_meta = _metadata_fb.ModelMetadataT()
model_meta.name = "Yolov5 object detector"
model_meta.description = ("Identify which of a known set of objects might be "
                          "present and provide information about their "
                          "positions within the given image.")
model_meta.version = "v1"
model_meta.author = "dvigouro"
model_meta.license = ("Apache License. Version 2.0 "
                      "http://www.apache.org/licenses/LICENSE-2.0.")

# Creates input info.
input_meta = _metadata_fb.TensorMetadataT()
input_meta.name = "image"
input_meta.description = (
    "Input image to be detected. The expected image is 640 x 640, "
    "with three channels (red, blue, and green) per pixel. Each value in the "
    "tensor is a single byte between 0 and 255.")
input_meta.content = _metadata_fb.ContentT()
input_meta.content.contentProperties = _metadata_fb.ImagePropertiesT()
input_meta.content.contentProperties.colorSpace = (
    _metadata_fb.ColorSpaceType.RGB)
input_meta.content.contentPropertiesType = (
    _metadata_fb.ContentProperties.ImageProperties)
input_normalization = _metadata_fb.ProcessUnitT()
input_normalization.optionsType = (
    _metadata_fb.ProcessUnitOptions.NormalizationOptions)
input_normalization.options = _metadata_fb.NormalizationOptionsT()
input_normalization.options.mean = [127.5]
input_normalization.options.std = [127.5]
input_meta.processUnits = [input_normalization]
input_stats = _metadata_fb.StatsT()
input_stats.max = [255]
input_stats.min = [0]
input_meta.stats = input_stats

# Creates output info.
output_location_meta = _metadata_fb.TensorMetadataT()
output_location_meta.name = "location"
output_location_meta.description = "The locations of the detected boxes."
output_location_meta.content = _metadata_fb.ContentT()
output_location_meta.content.contentPropertiesType = (
    _metadata_fb.ContentProperties.BoundingBoxProperties)
output_location_meta.content.contentProperties = (
    _metadata_fb.BoundingBoxPropertiesT())
#output_location_meta.content.contentProperties.index = [1, 0, 3, 2]
output_location_meta.content.contentProperties.index = [0, 1, 2, 3]
output_location_meta.content.contentProperties.type = (
    _metadata_fb.BoundingBoxType.BOUNDARIES)
#output_location_meta.content.contentProperties.type = (
#    _metadata_fb.BoundingBoxType.CENTER)
output_location_meta.content.contentProperties.coordinateType = (
    _metadata_fb.CoordinateType.RATIO)
output_location_meta.content.range = _metadata_fb.ValueRangeT()
output_location_meta.content.range.min = 2
output_location_meta.content.range.max = 2

output_class_meta = _metadata_fb.TensorMetadataT()
output_class_meta.name = "category"
output_class_meta.description = "The categories of the detected boxes."
output_class_meta.content = _metadata_fb.ContentT()
output_class_meta.content.contentPropertiesType = (
    _metadata_fb.ContentProperties.FeatureProperties)
output_class_meta.content.contentProperties = (
    _metadata_fb.FeaturePropertiesT())
output_class_meta.content.range = _metadata_fb.ValueRangeT()
output_class_meta.content.range.min = 2
output_class_meta.content.range.max = 2
label_file = _metadata_fb.AssociatedFileT()
label_file.name = os.path.basename(os.path.join('.', 'labelmap.txt'))
label_file.description = "Label of objects that this model can recognize."
label_file.type = _metadata_fb.AssociatedFileType.TENSOR_VALUE_LABELS
output_class_meta.associatedFiles = [label_file]

output_score_meta = _metadata_fb.TensorMetadataT()
output_score_meta.name = "score"
output_score_meta.description = "The scores of the detected boxes."
output_score_meta.content = _metadata_fb.ContentT()
output_score_meta.content.contentPropertiesType = (
    _metadata_fb.ContentProperties.FeatureProperties)
output_score_meta.content.contentProperties = (
    _metadata_fb.FeaturePropertiesT())
output_score_meta.content.range = _metadata_fb.ValueRangeT()
output_score_meta.content.range.min = 2
output_score_meta.content.range.max = 2

output_number_meta = _metadata_fb.TensorMetadataT()
output_number_meta.name = "number of detections"
output_number_meta.description = "The number of the detected boxes."
output_number_meta.content = _metadata_fb.ContentT()
output_number_meta.content.contentPropertiesType = (
    _metadata_fb.ContentProperties.FeatureProperties)
output_number_meta.content.contentProperties = (
    _metadata_fb.FeaturePropertiesT())

# Creates Tensor Groups
group = _metadata_fb.TensorGroupT()
group.name = "detection result"
group.tensorNames = [
    output_location_meta.name, output_class_meta.name,
    output_score_meta.name
]

# Creates subgraph info.
subgraph = _metadata_fb.SubGraphMetadataT()
subgraph.inputTensorMetadata = [input_meta]
subgraph.outputTensorMetadata = [output_location_meta, output_class_meta, output_score_meta,
    output_number_meta]
subgraph.outputTensorGroups = [group]

model_meta.subgraphMetadata = [subgraph]

b = flatbuffers.Builder(0)
b.Finish(
    model_meta.Pack(b),
    _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)
metadata_buf = b.Output()

# Populates the metadata
populator = _metadata.MetadataPopulator.with_model_file('mycustom-yolov5s-fp16.tflite')
populator.load_metadata_buffer(metadata_buf)
populator.load_associated_files([os.path.join('.', 'labelmap.txt')])
populator.populate()
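
The NormalizationOptions in the script above (mean = std = 127.5) tell the Task library to map byte pixel values [0, 255] into [-1, 1] before inference; a quick check of the arithmetic is below. Note that YOLOv5 itself typically scales pixels to [0, 1] (i.e. mean 0, std 255), so these values may need adjusting for a YOLOv5 export.

```python
import numpy as np

# Task-library normalization: normalized = (pixel - mean) / std
pixels = np.array([0, 127.5, 255], dtype=np.float32)
normalized = (pixels - 127.5) / 127.5
print(normalized)  # [-1.  0.  1.]
```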

I can run my model on the app, but I'm still testing the results and accuracy, and I'm facing performance issues when using the default export script (python.exe .\export.py --weights .\yolov5s.pt --include tflite). Anyway, debugging the ObjectDetector class is challenging for me, so I think I will switch to PyTorch Android to get more control and support. See Object Detection with YOLOv5 on Android or PytorchKotlinDemo for an example using Kotlin.

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

mikel-brostrom commented 2 years ago

Nice @dvigouro! What kind of performance issues are you experiencing? I am interested in knowing if it runs much faster than in other Yolov5 apps like: https://github.com/lp6m/yolov5s_android?

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

ShubhanCrypts commented 11 months ago

@dvigouro can you share the repo for your Android app? I have converted the model to TFLite, but I get an error when running it on the official example Android app.

fdff87554 commented 5 months ago

Hi, I was wondering if there is any continuation of this discussion; I encountered the same error output from TFLite when using the official export.py.

glenn-jocher commented 5 months ago

@fdff87554 hi! If you're encountering issues with the TFLite export using the official export.py, it might be helpful to ensure that your environment meets all the requirements and that you're using the latest version of the YOLOv5 repository. Sometimes, a fresh clone and install can resolve unexpected issues:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

If the problem persists, could you please provide more details about the error message and the steps you followed? This will help in diagnosing the issue more effectively. Thanks!

fdff87554 commented 5 months ago

> @fdff87554 hi! If you're encountering issues with the TFLite export using the official export.py, it might be helpful to ensure that your environment meets all the requirements and that you're using the latest version of the YOLOv5 repository. Sometimes, a fresh clone and install can resolve unexpected issues:
>
>     git clone https://github.com/ultralytics/yolov5  # clone
>     cd yolov5
>     pip install -r requirements.txt  # install
>
> If the problem persists, could you please provide more details about the error message and the steps you followed? This will help in diagnosing the issue more effectively. Thanks!

Hi, glad to receive your reply.

Let me quickly explain the current situation. I am trying to convert a custom-trained model into TFLite format, and the biggest problem I have encountered is the missing metadata in the TFLite model.

I had a quick look at the export.py code, and I saw that the conversion from the PyTorch model to the TensorFlow model is handled very carefully; even the differences in input structure are processed. But the metadata is still missing.

I keep getting the following error while loading the TFLite model in my Android application:

Error occurred when initializing ObjectDetector: Mobile SSD models are expected to have exactly 4 outputs, found 1.

I have confirmed that the cause is that the TFLite model must contain metadata in a specific format, so that location/category/score/number-of-detections information is exposed as output. However, when I inspected the tflite model exported by yolov5, there was no such metadata in the model.

This is interesting, because after the tflite model is generated there is an add_tflite_metadata step that adds metadata, rather than skipping the matter entirely, so I am not sure what the problem is.

--- [Updated] ---

I have found that the model file produced by add_tflite_metadata contains metadata in the following non-standard format:

Metadata: populated
{
   "subgraph_metadata": [
     {
       "input_tensor_metadata": [
         {
         }
       ],
       "output_tensor_metadata": [
         {
         }
       ]
     }
   ],
   "associated_files": [
     {
       "name": "meta.txt"
     }
   ],
   "min_parser_version": "1.0.0"
}

Associated file(s) populated:
['meta.txt']

This causes the following problems:

  1. The official displayer parsing demonstration raises no error message (because the outer structure conforms to the metadata framework), but it does not actually provide any useful information:
    displayer = _metadata.MetadataDisplayer.with_model_file(export_model_path)
    json_file = displayer.get_metadata_json()
  2. The Android app and other metadata parsers raise errors because there is no input/output metadata.

The officially expected metadata, by contrast, looks like this (example .tflite file output):

Metadata: populated
{
   "name": "ObjectDetector",
   "description": "Identify which of a known set of objects might be present and provide information about their positions within the given image or a video stream.",
   "subgraph_metadata": [
     {
       "input_tensor_metadata": [
         {
           "name": "image",
           "description": "Input image to be detected.",
           "content": {
             "content_properties_type": "ImageProperties",
             "content_properties": {
               "color_space": "RGB"
             }
           },
           "process_units": [
             {
               "options_type": "NormalizationOptions",
               "options": {
                 "mean": [
                   127.5
                 ],
                 "std": [
                   127.5
                 ]
               }
             }
           ],
           "stats": {
             "max": [
               255.0
             ],
             "min": [
               0.0
             ]
           }
         }
       ],
       "output_tensor_metadata": [
         {
           "name": "location",
           "description": "The locations of the detected boxes.",
           "content": {
             "content_properties_type": "BoundingBoxProperties",
             "content_properties": {
               "index": [
                 1,
                 0,
                 3,
                 2
               ],
               "type": "BOUNDARIES"
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           }
         },
         {
           "name": "category",
           "description": "The categories of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           },
           "associated_files": [
             {
               "name": "ssd_mobilenet_labels.txt",
               "description": "Labels for categories that the model can recognize.",
               "type": "TENSOR_VALUE_LABELS"
             }
           ]
         },
         {
           "name": "score",
           "description": "The scores of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           }
         },
         {
           "name": "number of detections",
           "description": "The number of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             }
           },
           "stats": {
           }
         }
       ],
       "output_tensor_groups": [
         {
           "name": "detection_result",
           "tensor_names": [
             "location",
             "category",
             "score"
           ]
         }
       ]
     }
   ],
   "min_parser_version": "1.2.0"
}

I examined the source code carefully, and compared with the official demonstration the current add_tflite_metadata differs only in that the input/output metadata is not populated manually, but I am not sure why the creation of the related metadata fails.

glenn-jocher commented 5 months ago

Hi @fdff87554, thanks for the detailed explanation! It seems like the issue revolves around the metadata not being correctly populated in the TFLite model, which is crucial for compatibility with certain applications, like the TFLite Object Detection Android app.

The metadata should indeed include detailed information about input/output tensors, including their types, expected content, and associated files for labels. The discrepancy you're seeing in the metadata format might be due to how the metadata is being attached in the export.py script.

Here’s a quick suggestion: You might need to manually adjust the metadata population script to ensure all necessary fields are correctly specified. This includes setting the right content properties for each tensor and ensuring the associated files (like label maps) are correctly linked.

If you're comfortable modifying the script, you can try to explicitly define the metadata as per the structure you've found to be expected. If this sounds a bit complex or if you're unsure how to proceed, could you share the relevant portion of your export.py where the metadata is being added? This way, I can help you adjust it to fit the required format.

Thanks for your patience, and looking forward to getting this resolved! 🚀

fdff87554 commented 5 months ago

> Hi @fdff87554, thanks for the detailed explanation! It seems like the issue revolves around the metadata not being correctly populated in the TFLite model, which is crucial for compatibility with certain applications, like the TFLite Object Detection Android app.
>
> The metadata should indeed include detailed information about input/output tensors, including their types, expected content, and associated files for labels. The discrepancy you're seeing in the metadata format might be due to how the metadata is being attached in the export.py script.
>
> Here's a quick suggestion: you might need to manually adjust the metadata population script to ensure all necessary fields are correctly specified. This includes setting the right content properties for each tensor and ensuring the associated files (like label maps) are correctly linked.
>
> If you're comfortable modifying the script, you can try to explicitly define the metadata as per the structure you've found to be expected. If this sounds a bit complex or if you're unsure how to proceed, could you share the relevant portion of your export.py where the metadata is being added? This way, I can help you adjust it to fit the required format.
>
> Thanks for your patience, and looking forward to getting this resolved! 🚀

def add_tflite_metadata(file, metadata, num_outputs):
    """
    Adds TFLite metadata to a model file, supporting multiple outputs, as specified by TensorFlow guidelines.

    https://www.tensorflow.org/lite/models/convert/metadata
    """
    with contextlib.suppress(ImportError):
        # check_requirements('tflite_support')
        from tflite_support import flatbuffers
        from tflite_support import metadata as _metadata
        from tflite_support import metadata_schema_py_generated as _metadata_fb

        tmp_file = Path("/tmp/meta.txt")
        with open(tmp_file, "w") as meta_f:
            meta_f.write(str(metadata))

        # print out tmp_file data
        with open(tmp_file, "r") as meta_f:
            print(meta_f.read())

        model_meta = _metadata_fb.ModelMetadataT()
        label_file = _metadata_fb.AssociatedFileT()
        label_file.name = tmp_file.name
        model_meta.associatedFiles = [label_file]

        subgraph = _metadata_fb.SubGraphMetadataT()
        subgraph.inputTensorMetadata = [_metadata_fb.TensorMetadataT()]
        subgraph.outputTensorMetadata = [_metadata_fb.TensorMetadataT()] * num_outputs
        model_meta.subgraphMetadata = [subgraph]

        b = flatbuffers.Builder(0)
        b.Finish(
            model_meta.Pack(b), _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER
        )
        metadata_buf = b.Output()

        populator = _metadata.MetadataPopulator.with_model_file(file)
        populator.load_metadata_buffer(metadata_buf)
        populator.load_associated_files([str(tmp_file)])
        populator.populate()
        tmp_file.unlink()

This is the current export.py code for adding tflite metadata, and its current output, as mentioned above, is

Metadata: populated
{
   "subgraph_metadata": [
     {
       "input_tensor_metadata": [
         {
         }
       ],
       "output_tensor_metadata": [
         {
         }
       ]
     }
   ],
   "associated_files": [
     {
       "name": "meta.txt"
     }
   ],
   "min_parser_version": "1.0.0"
}

Associated file(s) populated:
['meta.txt']
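
One Python detail worth noting in the snippet above: `[_metadata_fb.TensorMetadataT()] * num_outputs` creates num_outputs references to a single object, not num_outputs independent objects, so per-tensor fields could never be set independently on that list. A minimal illustration with a stand-in class:

```python
class TensorMeta:
    """Stand-in for tflite_support's TensorMetadataT."""
    def __init__(self):
        self.name = None

# List multiplication repeats the *same* object four times...
shared = [TensorMeta()] * 4
shared[0].name = "location"
print(shared[3].name)  # "location" -- all four entries alias one object

# ...while a comprehension creates four independent objects.
independent = [TensorMeta() for _ in range(4)]
independent[0].name = "location"
print(independent[3].name)  # None
```

Whether this matters for the current export (which leaves every tensor's metadata empty anyway) is a separate question, but it would bite as soon as per-output fields are filled in.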

To work with the Android app, the metadata should conform to the following format:

Metadata: populated
{
   "name": "ObjectDetector",
   "description": "Identify which of a known set of objects might be present and provide information about their positions within the given image or a video stream.",
   "subgraph_metadata": [
     {
       "input_tensor_metadata": [
         {
           "name": "image",
           "description": "Input image to be detected.",
           "content": {
             "content_properties_type": "ImageProperties",
             "content_properties": {
               "color_space": "RGB"
             }
           },
           "process_units": [
             {
               "options_type": "NormalizationOptions",
               "options": {
                 "mean": [
                   127.5
                 ],
                 "std": [
                   127.5
                 ]
               }
             }
           ],
           "stats": {
             "max": [
               255.0
             ],
             "min": [
               0.0
             ]
           }
         }
       ],
       "output_tensor_metadata": [
         {
           "name": "location",
           "description": "The locations of the detected boxes.",
           "content": {
             "content_properties_type": "BoundingBoxProperties",
             "content_properties": {
               "index": [
                 1,
                 0,
                 3,
                 2
               ],
               "type": "BOUNDARIES"
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           }
         },
         {
           "name": "category",
           "description": "The categories of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           },
           "associated_files": [
             {
               "name": "ssd_mobilenet_labels.txt",
               "description": "Labels for categories that the model can recognize.",
               "type": "TENSOR_VALUE_LABELS"
             }
           ]
         },
         {
           "name": "score",
           "description": "The scores of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             },
             "range": {
               "min": 2,
               "max": 2
             }
           },
           "stats": {
           }
         },
         {
           "name": "number of detections",
           "description": "The number of the detected boxes.",
           "content": {
             "content_properties_type": "FeatureProperties",
             "content_properties": {
             }
           },
           "stats": {
           }
         }
       ],
       "output_tensor_groups": [
         {
           "name": "detection_result",
           "tensor_names": [
             "location",
             "category",
             "score"
           ]
         }
       ]
     }
   ],
   "min_parser_version": "1.2.0"
}

I don't have a clear direction on how to modify it at the moment; I'm experimenting, but I would love to hear what you think.

I just tried to make some simple adjustments, but the result still doesn't meet expectations. I'd appreciate it if you could help me take a look.

def add_tflite_metadata(file, metadata, num_outputs):
    """
    Adds TFLite metadata to a model file, supporting multiple outputs, as specified by TensorFlow guidelines.

    https://www.tensorflow.org/lite/models/convert/metadata
    """
    with contextlib.suppress(ImportError):
        # check_requirements('tflite_support')
        from tflite_support import flatbuffers
        from tflite_support import metadata as _metadata
        from tflite_support import metadata_schema_py_generated as _metadata_fb

        tmp_file = Path("/tmp/meta.txt")
        with open(tmp_file, "w") as meta_f:
            meta_f.write(str(metadata))

        # print out tmp_file data
        with open(tmp_file, "r") as meta_f:
            print(meta_f.read())

        model_meta = _metadata_fb.ModelMetadataT()
        label_file = _metadata_fb.AssociatedFileT()
        label_file.name = tmp_file.name
        model_meta.associatedFiles = [label_file]

        subgraph = _metadata_fb.SubGraphMetadataT()

        # # create input tensor metadata
        # input_meta = _metadata_fb.TensorMetadataT()

        # create output tensor metadata
        output_meta = _metadata_fb.TensorMetadataT()
        output_meta.name = "output"  # placeholder name, to be replaced

        label_file = _metadata_fb.AssociatedFileT()
        label_file.name = tmp_file.name
        label_file.description = "Labels for objects that the model can recognize."
        label_file.type = _metadata_fb.AssociatedFileType.TENSOR_AXIS_LABELS
        output_meta.associatedFiles = [label_file]
        # NOTE: output_meta is never attached to the subgraph below, so it has no effect

        subgraph.inputTensorMetadata = [_metadata_fb.TensorMetadataT()]
        subgraph.outputTensorMetadata = [_metadata_fb.TensorMetadataT()] * num_outputs
        model_meta.subgraphMetadata = [subgraph]

        b = flatbuffers.Builder(0)
        b.Finish(
            model_meta.Pack(b), _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER
        )
        metadata_buf = b.Output()

        populator = _metadata.MetadataPopulator.with_model_file(file)
        populator.load_metadata_buffer(metadata_buf)
        populator.load_associated_files([str(tmp_file)])
        populator.populate()
        tmp_file.unlink()

glenn-jocher commented 5 months ago

Hi @fdff87554, thanks for sharing the details and your efforts in adjusting the metadata! It looks like the metadata structure needs to be more explicitly defined to match the expected format for the TFLite Object Detection model.

Here's a simplified approach to adjust the metadata in your export.py script:

  1. Define each tensor's metadata explicitly, including name, description, content properties, and associated files if necessary.
  2. Create and populate the metadata for input and output tensors according to the structure you've identified as required.

Here's a basic example to guide you on how to structure the metadata:

from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb
from tflite_support import flatbuffers

def create_metadata():
    # Create model metadata.
    model_meta = _metadata_fb.ModelMetadataT()
    model_meta.name = "ObjectDetector"
    model_meta.description = "Model to detect objects in an image."
    model_meta.version = "v1"
    model_meta.author = "YOLOv5 Team"
    model_meta.license = "Public Domain"

    # Create input tensor metadata.
    input_meta = _metadata_fb.TensorMetadataT()
    input_meta.name = "image"
    input_meta.description = "Input image to be processed by the model"
    input_meta.content = _metadata_fb.ContentT()
    input_meta.content.contentProperties = _metadata_fb.ImagePropertiesT()
    input_meta.content.contentProperties.colorSpace = _metadata_fb.ColorSpaceType.RGB
    input_normalization = _metadata_fb.ProcessUnitT()
    input_normalization.optionsType = _metadata_fb.ProcessUnitOptions.NormalizationOptions
    input_normalization.options = _metadata_fb.NormalizationOptionsT()
    input_normalization.options.mean = [127.5]
    input_normalization.options.std = [127.5]
    input_meta.processUnits = [input_normalization]
    input_stats = _metadata_fb.StatsT()
    input_stats.max = [255]
    input_stats.min = [0]
    input_meta.stats = input_stats

    # Create output tensor metadata.
    output_meta = _metadata_fb.TensorMetadataT()
    output_meta.name = "detection_boxes"
    output_meta.description = "Locations of detected objects"
    output_meta.content = _metadata_fb.ContentT()
    output_meta.content.contentProperties = _metadata_fb.BoundingBoxPropertiesT()
    output_meta.content.contentProperties.index = [0, 1, 2, 3]
    output_meta.content.contentProperties.type = _metadata_fb.BoundingBoxType.BOUNDARIES
    output_meta.content.contentProperties.coordinateType = _metadata_fb.CoordinateType.RATIO

    # Assign the metadata to the model.
    subgraph = _metadata_fb.SubGraphMetadataT()
    subgraph.inputTensorMetadata = [input_meta]
    subgraph.outputTensorMetadata = [output_meta]
    model_meta.subgraphMetadata = [subgraph]

    b = flatbuffers.Builder(0)
    b.Finish(model_meta.Pack(b), _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)
    metadata_buf = b.Output()

    return metadata_buf

def add_metadata_to_tflite(file_path, metadata_buffer):
    populator = _metadata.MetadataPopulator.with_model_file(file_path)
    populator.load_metadata_buffer(metadata_buffer)
    populator.populate()

# Usage
metadata_buffer = create_metadata()
add_metadata_to_tflite("path_to_your_model.tflite", metadata_buffer)

This example should help you get started on structuring and applying the correct metadata to your TFLite model. Adjust the details as necessary to fit your specific model's requirements. Let me know if this helps or if you need further assistance! 🚀

fdff87554 commented 4 months ago

Hi,

I'm sorry for such a late reply. My partners and I are still gradually confirming the problem and correcting it, so we haven't concluded yet.

We expect that after fixing all the problems (whether it is an Error on Android or various error messages on TFLite), we will compile a detailed correction process or report on the issues we continue to encounter in the future.

Appreciate the assistance, please give us a little more time to confirm.

glenn-jocher commented 4 months ago

Hi there,

No worries about the delay! We appreciate your diligence in investigating the issue. It's great to hear that you and your partners are actively working on identifying and resolving the problems.

To ensure we can assist you effectively, could you please provide a minimum reproducible code example? This will help us understand the exact context and reproduce the issue on our end. You can find more details on how to create one here: Minimum Reproducible Example. This step is crucial for us to investigate and provide a precise solution.

Additionally, please make sure you are using the latest versions of torch and the YOLOv5 repository from Ultralytics. Sometimes, updates can resolve unexpected issues.

Thank you for your patience and collaboration. We're here to help, so feel free to share any further details or questions you might have. Looking forward to your update! 😊