google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

error using edge tpu compiler #210

Closed vadim-SX closed 2 years ago

vadim-SX commented 4 years ago

When I run the compiler on a TFLite model, I get the following error: ERROR: Didn't find op for builtin opcode 'CONV_2D' version '5'

The model was converted with tf-nightly 2.4.0.dev20200831 on Ubuntu 18.04. I needed that version of TF because with previous versions I can't convert the models to TFLite. I am trying to run this model from the zoo, ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8, on the Edge TPU. Any idea what I need to do? Thanks.

Namburger commented 4 years ago

@vadim-SX for now, I suggest that you stay within tf2.2 for full compatibility. There are some new features that the released compiler doesn't support yet.
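A quick way to check whether the TensorFlow version used for conversion falls within the range suggested above. This is only a sketch: the 2.2 ceiling comes from the comment above and reflects the compiler as it was at that point in the thread, and the helper names are made up for illustration.

```python
import re

# Assumption taken from the comment above: TF 2.2 was the newest release
# the Edge TPU compiler fully supported at that point in the thread.
MAX_SUPPORTED = (2, 2)


def parse_major_minor(version):
    """Return (major, minor) from a TensorFlow version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def within_supported_range(version, max_supported=MAX_SUPPORTED):
    """True if the version is no newer than the assumed supported ceiling."""
    return parse_major_minor(version) <= max_supported


# The nightly build from the original report, then a 2.2 release.
print(within_supported_range("2.4.0.dev20200831"))
print(within_supported_range("2.2.0"))
```

With TensorFlow installed, the same check could be run on `tf.__version__` before converting a model.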

vadim-SX commented 4 years ago

@Namburger the problem is that with 2.2 I can't convert the model to TFLite, so I don't know what to do.

Namburger commented 4 years ago

Hmm, I'm not sure the compiler supports FPN yet, so unless that is implemented and well tested, it isn't supported. Could you attach your model here? I can take a look.

vadim-SX commented 4 years ago

Hi, thanks. I took the models from here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

I tried:
SSD MobileNet v2 320x320
SSD MobileNet V2 FPNLite 320x320
SSD MobileNet V2 FPNLite 640x640

I want to compile one of them to run on the Coral. BTW, the TF Docker image comes with TF 2.3, but I wasn't able to compile any of them for the Coral. My problem is that only with TF 2.4 am I able to convert them to TFLite. If you have an SSD MobileNet v2 model in another repo that I can convert, I will be glad. Thanks.

Namburger commented 4 years ago

@vadim-SX I see, you are trying to use the Object Detection API for TF 2.x. Sorry, currently only TF 1.x is supported for object detection; we are in the process of integrating 2.x. For MobileNet you can use this one: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz MobileDet is also a good one: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md#pixel4-edge-tpu-models

vadim-SX commented 4 years ago

@Namburger thanks, any idea when it will be supported?

Namburger commented 4 years ago

No actual date yet, but we're working on it right now and making lots of good progress. The problem is that TensorFlow does not provide a way to export the model to the right format for our compiler, so we're looking for a better way to do this. You can follow our news page for any updates: https://coral.ai/news

vadim-SX commented 4 years ago

Thanks for your quick replies. Another thing I am interested in: are you planning to support compilation for EfficientDet and EfficientNet?

Namburger commented 4 years ago

I believe EfficientDet is pending progress on the TensorFlow team. You can follow this thread; I know it hasn't had any updates in a while :(

EfficientNet is already provided by TensorFlow; here are the instructions: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/edgetpu#post-training-quantization There should already be compilable models in those packages.

alan303138 commented 3 years ago

How can I retrain EfficientNet-EdgeTPU using my own data on a GPU?

Namburger commented 3 years ago

@alan303138 I believe this flag can be flipped: https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/main.py#L43 Although you may get a more appropriate answer here: https://github.com/tensorflow/tpu/issues

trungnguyensju commented 3 years ago

I used ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03 for transfer learning, as suggested above. The final model can't detect any objects. I checked the "detection_scores" and found that all the scores are really small (around 0.20 for the highest scores). Any ideas on this would be highly appreciated. Thank you!

Namburger commented 3 years ago

@trungnguyensju do you mind attaching your entire pipeline?

masc-it commented 3 years ago

Any updates on TF2 Object Detection API support?

luke-iqt commented 3 years ago

I also wanted to see if there is an update - Is it possible to use the TF2 Object Detection API, with a MobileNet SSD model and compile it to work on the EdgeTPU? Do I have to use TF1 if I want to use a custom model on the EdgeTPU?

luke-iqt commented 3 years ago

I managed to get a custom MobileNet SSD model that I trained to compile. I used the following:

Export a TFLite compatible model

!python /tf/models/research/object_detection/export_tflite_graph_tf2.py \
  --pipeline_config_path={pipeline_file} \
  --trained_checkpoint_dir={model_dir} \
  --output_directory={model_export_dir}tflite-compatible
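Outside a notebook, the same export step could be driven from a script. A minimal sketch, assuming the script path and the three flags shown above; `build_export_command` and the example paths passed to it are hypothetical.

```python
import shlex

# Script path as used in the command above; adjust for your checkout.
EXPORT_SCRIPT = "/tf/models/research/object_detection/export_tflite_graph_tf2.py"


def build_export_command(pipeline_file, model_dir, model_export_dir,
                         script=EXPORT_SCRIPT):
    """Build the argument list for the TFLite-compatible export step.

    model_export_dir is assumed to end with a slash, as in the notebook.
    """
    return [
        "python", script,
        f"--pipeline_config_path={pipeline_file}",
        f"--trained_checkpoint_dir={model_dir}",
        f"--output_directory={model_export_dir}tflite-compatible",
    ]


# Hypothetical paths, for illustration only.
cmd = build_export_command("pipeline.config", "training/", "export/")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.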

Use the Python API to export a post-training quantized TFLite model

# https://github.com/tensorflow/models/issues/9033#issuecomment-706573546
import tensorflow as tf

# model_export_dir is set earlier in the notebook (not shown here); the paths
# below assume it ends with a trailing slash.

def representative_data_gen():
    """Yield up to 100 calibration images, resized to 300x300 and scaled to [0, 1]."""
    path = '/tf/testing/Airbus A319-115'

    dataset_list = tf.data.Dataset.list_files(path + '/*.jpg')
    # Take 100 distinct files; calling next(iter(dataset_list)) inside the loop
    # would restart the dataset on every pass.
    for image_path in dataset_list.take(100):
        image = tf.io.read_file(image_path)
        image = tf.io.decode_jpeg(image, channels=3)
        image = tf.image.resize(image, [300, 300])
        image = tf.cast(image / 255., tf.float32)
        image = tf.expand_dims(image, 0)
        yield [image]

# Convert the exported SavedModel with post-training quantization.
converter = tf.lite.TFLiteConverter.from_saved_model(model_export_dir+"tflite-compatible/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
                                       tf.lite.OpsSet.TFLITE_BUILTINS]
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

# Save the model.
with open(model_export_dir+'model.tflite', 'wb') as f:
    f.write(tflite_model)

# Compile for the Edge TPU (Jupyter shell command).
!edgetpu_compiler -s {model_export_dir}model.tflite -o {model_export_dir}

Edge TPU Compiler version 15.0.340273435

Model compiled successfully in 1693 ms.

Input model: /tf/dataset/mobilenet_plane_detect/model.tflite
Input size: 5.08MiB
Output model: /tf/dataset/mobilenet_plane_detect/model_edgetpu.tflite
Output size: 5.28MiB
On-chip memory used for caching model parameters: 5.16MiB
On-chip memory remaining for caching model parameters: 2.56MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 112
Operation log: /tf/dataset/mobilenet_plane_detect/model_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 108
Number of operations that will run on CPU: 4

Operator                       Count      Status

CONV_2D                        55         Mapped to Edge TPU
QUANTIZE                       10         Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
DEPTHWISE_CONV_2D              17         Mapped to Edge TPU
LOGISTIC                       1          Mapped to Edge TPU
RESHAPE                        13         Mapped to Edge TPU
CONCATENATION                  2          Mapped to Edge TPU
CUSTOM                         1          Operation is working on an unsupported data type
ADD                            10         Mapped to Edge TPU
DEQUANTIZE                     2          Operation is working on an unsupported data type

alan303138 commented 3 years ago

@luke-iqt Can I know your TF and CUDA versions and which GPU you are using right now? I compiled the ssd_mobilenet_v2_FPN_320x320 model and got this log.

Edge TPU Compiler version 15.0.340273435
Input: saved_model.tflite
Output: saved_model_edgetpu.tflite

Operator                       Count      Status

QUANTIZE                       6          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
QUANTIZE                       3          More than one subgraph is not supported
CONV_2D                        58         Mapped to Edge TPU
CONV_2D                        14         More than one subgraph is not supported
DEQUANTIZE                     2          Operation is working on an unsupported data type
DEPTHWISE_CONV_2D              14         More than one subgraph is not supported
DEPTHWISE_CONV_2D              37         Mapped to Edge TPU
LOGISTIC                       1          More than one subgraph is not supported
CONCATENATION                  2          More than one subgraph is not supported
PACK                           4          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
RESHAPE                        6          Mapped to Edge TPU
RESHAPE                        6          More than one subgraph is not supported
ADD                            10         Mapped to Edge TPU
ADD                            2          More than one subgraph is not supported
CUSTOM                         1          Operation is working on an unsupported data type

luke-iqt commented 3 years ago

Oh Cool! I thought that TFlite was only working with SSD.

Here is my SMI info:

Sun Feb  7 07:30:51 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 20%   23C    P8     8W / 250W |  10523MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     27725      C   /usr/bin/python3                10519MiB |
+-----------------------------------------------------------------------------+

I am using the tensorflow/tensorflow:nightly-gpu-jupyter Docker container, which reports TF version 2.5.0-dev20210203.

I used transfer learning and started from this model: ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz

jagumiel commented 3 years ago

I have quantized SSD-MOBILENET-FPNLITE and SSD-MOBILENET to use them on the Edge TPU. I obtained the following results after compilation:

==============================================================================

USING ssd_mobilenet_v2_fpnlite_320x320_coco17

Input model: /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/6_Tflite-uint8IO-experimental-quantized.tflite
Input size: 3.70MiB
Output model: 6_Tflite-uint8IO-experimental-quantized_edgetpu.tflite
Output size: 4.21MiB
On-chip memory used for caching model parameters: 3.42MiB
On-chip memory remaining for caching model parameters: 4.31MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 162
Operation log: 6_Tflite-uint8IO-experimental-quantized_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 112
Number of operations that will run on CPU: 50

Operator                       Count      Status

LOGISTIC                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
QUANTIZE                       4          Operation is otherwise supported, but not mapped due to some unspecified limitation
QUANTIZE                       1          Mapped to Edge TPU
DEQUANTIZE                     1          Operation is otherwise supported, but not mapped due to some unspecified limitation
DEQUANTIZE                     1          Operation is working on an unsupported data type
CONV_2D                        14         More than one subgraph is not supported
CONV_2D                        58         Mapped to Edge TPU
DEPTHWISE_CONV_2D              37         Mapped to Edge TPU
DEPTHWISE_CONV_2D              14         More than one subgraph is not supported
CONCATENATION                  1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION                  1          More than one subgraph is not supported
PACK                           4          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
RESHAPE                        4          More than one subgraph is not supported
RESHAPE                        2          Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE                        6          Mapped to Edge TPU
CUSTOM                         1          Operation is working on an unsupported data type
ADD                            10         Mapped to Edge TPU
ADD                            2          More than one subgraph is not supported

==============================================================================

USING ssd_mobilenet_v2_320x320_coco17

Input model: 9_Tflite-uint8IO-experimental-quantized.tflite
Input size: 6.43MiB
Output model: 9_Tflite-uint8IO-experimental-quantized_edgetpu.tflite
Output size: 6.69MiB
On-chip memory used for caching model parameters: 6.52MiB
On-chip memory remaining for caching model parameters: 1.20MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 106
Operation log: 9_Tflite-uint8IO-experimental-quantized_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 99
Number of operations that will run on CPU: 7

Operator                       Count      Status

DEPTHWISE_CONV_2D              17         Mapped to Edge TPU
DEQUANTIZE                     2          Operation is working on an unsupported data type
CONV_2D                        55         Mapped to Edge TPU
QUANTIZE                       1          Mapped to Edge TPU
QUANTIZE                       4          Operation is otherwise supported, but not mapped due to some unspecified limitation
LOGISTIC                       1          Mapped to Edge TPU
CUSTOM                         1          Operation is working on an unsupported data type
ADD                            10         Mapped to Edge TPU
RESHAPE                        13         Mapped to Edge TPU
CONCATENATION                  2          Mapped to Edge TPU

==============================================================================
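The operator tables above can be totalled mechanically, which makes it easier to compare how much of each model falls back to the CPU. A minimal sketch, assuming the rows keep the operator, count, status layout printed by the compiler; `summarize_ops` and the regular expression are illustrative and not part of any Coral tool.

```python
import re
from collections import Counter

# Rows look like: OPERATOR_NAME <spaces> COUNT <spaces> STATUS TEXT
ROW = re.compile(r"^([A-Z][A-Z0-9_]*)\s+(\d+)\s+(\S.*?)\s*$")
MAPPED = "Mapped to Edge TPU"


def summarize_ops(log_text):
    """Return (ops on Edge TPU, ops on CPU, per-status totals)."""
    per_status = Counter()
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and summary lines do not match and are skipped
            per_status[match.group(3)] += int(match.group(2))
    on_tpu = per_status[MAPPED]
    on_cpu = sum(per_status.values()) - on_tpu
    return on_tpu, on_cpu, dict(per_status)


# Operator table for ssd_mobilenet_v2_320x320_coco17, copied from the log above.
SSD_LOG = """\
Operator Count Status
DEPTHWISE_CONV_2D 17 Mapped to Edge TPU
DEQUANTIZE 2 Operation is working on an unsupported data type
CONV_2D 55 Mapped to Edge TPU
QUANTIZE 1 Mapped to Edge TPU
QUANTIZE 4 Operation is otherwise supported, but not mapped due to some unspecified limitation
LOGISTIC 1 Mapped to Edge TPU
CUSTOM 1 Operation is working on an unsupported data type
ADD 10 Mapped to Edge TPU
RESHAPE 13 Mapped to Edge TPU
CONCATENATION 2 Mapped to Edge TPU
"""

on_tpu, on_cpu, per_status = summarize_ops(SSD_LOG)
print(f"Edge TPU: {on_tpu}, CPU: {on_cpu}")
for status, count in per_status.items():
    print(f"{count:4d}  {status}")
```

For this table the totals agree with the compiler's own summary (99 operations on the Edge TPU, 7 on the CPU). Running the same function on the FPNLite table groups its 50 CPU operations by the reason the compiler gave.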

I am also getting worse timing with the FPN version: with ssd_mobilenet I get 20 ms, but with the ssd_mobilenet_fpnlite model I get 200 ms.

I have been using Tensorflow-nightly 2.6.0 for the quantization process, and my input and output tensors are both uint8.
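For readers unfamiliar with uint8 input and output tensors: TFLite's quantization scheme relates a stored integer q to a real value as real = scale * (q - zero_point). The sketch below shows that mapping in plain Python. The scale and zero point used here are hypothetical values chosen for illustration; in practice they would be read from the model's own tensor details rather than hard-coded.

```python
def quantize(real, scale, zero_point):
    """Map a real value to uint8 using real = scale * (q - zero_point)."""
    q = round(real / scale) + zero_point
    return max(0, min(255, q))  # clamp to the uint8 range


def dequantize(q, scale, zero_point):
    """Map a uint8 value back to the real value it represents."""
    return scale * (q - zero_point)


# Hypothetical parameters: covers roughly [-1, 1) with 256 levels.
SCALE = 1 / 128
ZERO_POINT = 128

for value in (-1.0, 0.0, 0.5, 1.0):
    q = quantize(value, SCALE, ZERO_POINT)
    print(value, "->", q, "->", dequantize(q, SCALE, ZERO_POINT))
```

Values outside the representable range are clamped, so 1.0 is stored as 255 and reads back slightly below 1.0 with these parameters.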

Are there compatibility problems with FPN networks?

Thanks in advance

hjonnala commented 3 years ago

@vadim-SX do you still have any questions here?

google-coral-bot[bot] commented 2 years ago

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-coral-bot[bot] commented 2 years ago

Closing as stale. Please reopen if you'd like to work on this further.
