PINTO0309 / PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
https://qiita.com/PINTO
MIT License

YOLOX EdgeTPU compilation fails #135

Closed: zye1996 closed this issue 3 years ago

zye1996 commented 3 years ago

1. OS Ubuntu 20.04

2. OS Architecture x86_64

3. Version of OpenVINO 2021.4

4. Version of TensorFlow 2.5.0

5. URL https://github.com/PINTO0309/PINTO_model_zoo/tree/main/132_YOLOX

6. Issue Details

I tried to run `edgetpu_compiler -s model_full_integer_quant.tflite` directly on the model downloaded through the repository link, and the compilation failed with the following error:

```
Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.
ERROR: Restored original execution plan after delegate application failure.
Compilation failed: Compilation failed due to large activation tensors in model.
Compilation child process completed within timeout period.
Compilation failed!
```

Is there anything I am missing during the compilation?

PINTO0309 commented 3 years ago

The basic usage of the compiler should be asked about in the edgetpu repository: https://github.com/google-coral/edgetpu/issues

```
$ edgetpu_compiler -sad model_full_integer_quant.tflite
Edge TPU Compiler version 16.0.384591198
Searching for valid delegate with step 1
Try to compile segment with 416 ops
Started a compilation timeout timer of 180 seconds.
ERROR: Restored original execution plan after delegate application failure.
Compilation failed: Compilation failed due to large activation tensors in model.
Compilation child process completed within timeout period.
Compilation failed!
Try to compile segment with 415 ops
Intermediate tensors: StatefulPartitionedCall:0_int8
Started a compilation timeout timer of 180 seconds.
ERROR: Restored original execution plan after delegate application failure.
Compilation failed: Compilation failed due to large activation tensors in model.
Compilation child process completed within timeout period.
Compilation failed!
Try to compile segment with 414 ops
Intermediate tensors: model/tf.reshape_1/Reshape_requantized,model/tf.reshape/Reshape,model/tf.reshape_2/Reshape_requantized
Started a compilation timeout timer of 180 seconds.
ERROR: Restored original execution plan after delegate application failure.
Compilation failed: Compilation failed due to large activation tensors in model.
Compilation child process completed within timeout period.
Compilation failed!
Try to compile segment with 413 ops
Intermediate tensors: model/tf.reshape_1/Reshape_requantized,model/tf.reshape_2/Reshape_requantized,model/tf.concat_10/concat
Started a compilation timeout timer of 180 seconds.

Model compiled successfully in 1436 ms.

Input model: model_full_integer_quant.tflite
Input size: 1.25MiB
Output model: model_full_integer_quant_edgetpu.tflite
Output size: 1.49MiB
On-chip memory used for caching model parameters: 1.09MiB
On-chip memory remaining for caching model parameters: 6.10MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 416
Operation log: model_full_integer_quant_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 413
Number of operations that will run on CPU: 3

Operator                       Count      Status

STRIDED_SLICE                  6          Mapped to Edge TPU
ADD                            7          Mapped to Edge TPU
PAD                            28         Mapped to Edge TPU
QUANTIZE                       21         Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
MAX_POOL_2D                    3          Mapped to Edge TPU
CONCATENATION                  1          More than one subgraph is not supported
CONCATENATION                  17         Mapped to Edge TPU
RESHAPE                        2          Mapped to Edge TPU
RESHAPE                        1          More than one subgraph is not supported
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
LOGISTIC                       110        Mapped to Edge TPU
CONV_2D                        83         Mapped to Edge TPU
DEPTHWISE_CONV_2D              30         Mapped to Edge TPU
MUL                            104        Mapped to Edge TPU
Compilation child process completed within timeout period.
Compilation succeeded!
```
alexanderfrey commented 3 years ago

@zye1996 It turns out that the Focus layer, which splits the input image into 4 crops and concatenates them along the channel axis, is causing the error. You can either get rid of the Focus layer in the backbone or live with the slower model that has 3 ops running on the CPU.
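For context, here is a minimal sketch of what a Focus layer computes (NHWC layout assumed; the exact graph in the model zoo may differ in details):

```python
import tensorflow as tf

def focus_slice(x):
    # Focus layer as used in YOLOX: take four strided spatial crops of the
    # input and concatenate them along the channel axis.
    # Shape: (1, H, W, C) -> (1, H/2, W/2, 4*C).
    top_left     = x[:, ::2,  ::2,  :]
    bottom_left  = x[:, 1::2, ::2,  :]
    top_right    = x[:, ::2,  1::2, :]
    bottom_right = x[:, 1::2, 1::2, :]
    return tf.concat([top_left, bottom_left, top_right, bottom_right], axis=-1)

x = tf.random.uniform((1, 416, 416, 3))
print(focus_slice(x).shape)  # (1, 208, 208, 12)
```

The strided slices plus concat are likely what surface as the unmapped RESHAPE/CONCATENATION ops in the compiler log above. A space-to-depth op moves the same pixels (with a different channel ordering), which is one reason replacing or removing the Focus layer is a common workaround.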

zye1996 commented 3 years ago

> @zye1996 It turns out that the Focus layer, which splits the input image into 4 crops and concatenates them along the channel axis, is causing the error. You can either get rid of the Focus layer in the backbone or live with the slower model that has 3 ops running on the CPU.

Hi Alex, thank you for the follow-up. I also noticed that after quantization the detection accuracy dropped a lot compared to the float32 version of YOLOX (I tested both nano and tiny at 416 input size). Did you also check the accuracy, by any chance?

heekinho commented 3 years ago

> Hi Alex, thank you for the follow-up. I also noticed that after quantization the detection accuracy dropped a lot compared to the float32 version of YOLOX (I tested both nano and tiny at 416 input size). Did you also check the accuracy, by any chance?

I experienced exactly the same issue. Although the model runs and even detects some subjects, the drop in detection accuracy seems much higher than expected. It feels like a small detail is off somewhere. I tested with the provided nano models.

Any updates on this? Thanks for the support and great work, guys!
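One rough way to put a number on the quantization drop is to run the same input through a float32 TFLite model and its full-integer counterpart and compare outputs. The sketch below uses a tiny hypothetical stand-in network (converting the real YOLOX graph is out of scope here), but the `run` helper shows the int8 input/output (de)quantization you need when comparing the zoo's float32 and `full_integer_quant` models:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in network; in practice, load the YOLOX float32 and
# full-integer .tflite files from the model zoo instead.
model = tf.keras.Sequential(
    [tf.keras.Input(shape=(16, 16, 3)), tf.keras.layers.Conv2D(4, 3)]
)

def representative_data():
    # Calibration samples for full-integer quantization.
    for _ in range(8):
        yield [np.random.rand(1, 16, 16, 3).astype(np.float32)]

# Float32 reference model.
float_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Full-integer (int8 in/out) model, mirroring the zoo's quantized variants.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
int8_model = converter.convert()

def run(model_content, img):
    # Run one float32 image through a TFLite model, quantizing the input and
    # dequantizing the output when the model uses int8 I/O.
    interp = tf.lite.Interpreter(model_content=model_content)
    interp.allocate_tensors()
    inp, out = interp.get_input_details()[0], interp.get_output_details()[0]
    x = img
    if inp["dtype"] == np.int8:
        scale, zero = inp["quantization"]
        x = np.clip(np.round(img / scale + zero), -128, 127).astype(np.int8)
    interp.set_tensor(inp["index"], x)
    interp.invoke()
    y = interp.get_tensor(out["index"]).astype(np.float32)
    if out["dtype"] == np.int8:
        scale, zero = out["quantization"]
        y = (y - zero) * scale
    return y

img = np.random.rand(1, 16, 16, 3).astype(np.float32)
err = np.mean(np.abs(run(float_model, img) - run(int8_model, img)))
print("mean abs output difference:", err)
```

If the per-tensor scales reported in the input/output details of the real model look unusually coarse, that is often where a large accuracy drop comes from; the calibration (representative dataset) is usually the first thing to check.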