google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

On the meaning implied by "More than one subgraph is not supported" #461

Closed PINTO0309 closed 3 years ago

PINTO0309 commented 3 years ago

Description

1. Overview

With Edge TPU Compiler v16, 99% of the operations are mapped to the Edge TPU; only four operations close to the model's input are not. This indicates that most operations can now be converted simply by upgrading from an older compiler to the latest one.

What I want to know is why the remaining four operations (Mean x2, Sub x1, and SquaredDifference x1) fail to convert, and how to fix it. I understand that previous compilers sometimes failed to convert input tensors with a large resolution, but that this has been resolved in the latest compiler thanks to the efforts of the engineers. This time the input resolution is not very large: [1, 256, 128, 3].

  1. I am able to modify the model by hand, so if there is a workaround, I would appreciate hearing about it.
  2. I would like to know the real meaning of the warning message "More than one subgraph is not supported" that is displayed even though the model has a very simple structure.

2. Environment

  1. Ubuntu 20.04 x86_64
  2. TensorFlow v2.6.0
  3. Edge TPU Compiler version 16.0.384591198
  4. PINTO0309/openvino2tensorflow v1.19.3 (Model transformation tool I created.)

3. Steps to Reproduce

I converted Intel's person re-identification model, which is published at the following URL, to an Edge TPU model. https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/person-reidentification-retail-0277

The conversion flow is as follows: 1. OpenVINO IR (.xml/.bin, NCHW) -> 2. TensorFlow saved_model (NHWC) -> 3. Full Integer Quantized tflite -> 4. EdgeTPU tflite

The files generated at each step of the conversion are attached below.

  1. person-reidentification-retail-0277_xml_bin.zip
  2. tensorflow_saved_model.zip
  3. float32_int8_fullint8_tflite.zip
  4. edgetpu_tflite.zip

The log when the last EdgeTPU tflite was generated is shown below.

Edge TPU Compiler version 16.0.384591198
Input: saved_model/model_full_integer_quant.tflite
Output: saved_model/model_full_integer_quant_edgetpu.tflite

Operator                       Count      Status

ADD                            65         Mapped to Edge TPU
LOGISTIC                       24         Mapped to Edge TPU
MAX_POOL_2D                    1          Mapped to Edge TPU
PAD                            61         Mapped to Edge TPU
MEAN                           2          More than one subgraph is not supported
MEAN                           50         Mapped to Edge TPU
PRELU                          1          Mapped to Edge TPU
QUANTIZE                       2          Mapped to Edge TPU
CONV_2D                        103        Mapped to Edge TPU
FULLY_CONNECTED                1          Mapped to Edge TPU
RSQRT                          14         Mapped to Edge TPU
SPLIT                          1          Mapped to Edge TPU
CONCATENATION                  1          Mapped to Edge TPU
RESHAPE                        26         Mapped to Edge TPU
AVERAGE_POOL_2D                2          Mapped to Edge TPU
DEPTHWISE_CONV_2D              61         Mapped to Edge TPU
MUL                            53         Mapped to Edge TPU
SUB                            1          More than one subgraph is not supported
SUB                            13         Mapped to Edge TPU
SQUARED_DIFFERENCE             1          More than one subgraph is not supported
SQUARED_DIFFERENCE             13         Mapped to Edge TPU
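
To pick out the operators that were not mapped from a table like the one above, a small parser can help. The following is a minimal sketch, not part of the Edge TPU tooling: the function name `unmapped_ops` and the regular expression are assumptions made for illustration, and the sample text is an excerpt of the log above.

```python
import re

# Excerpt of the operator table printed by the compiler (copied from the log above).
LOG = """\
Operator                       Count      Status

ADD                            65         Mapped to Edge TPU
MEAN                           2          More than one subgraph is not supported
MEAN                           50         Mapped to Edge TPU
SUB                            1          More than one subgraph is not supported
SUB                            13         Mapped to Edge TPU
SQUARED_DIFFERENCE             1          More than one subgraph is not supported
SQUARED_DIFFERENCE             13         Mapped to Edge TPU
"""

# Assumed row layout: OPERATOR_NAME <spaces> COUNT <spaces> STATUS TEXT
ROW = re.compile(r"^(?P<op>[A-Z0-9_]+)\s+(?P<count>\d+)\s+(?P<status>\S.*)$")


def unmapped_ops(log_text):
    """Return {operator: count} for rows whose status is not 'Mapped to Edge TPU'."""
    result = {}
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match and match["status"].strip() != "Mapped to Edge TPU":
            result[match["op"]] = result.get(match["op"], 0) + int(match["count"])
    return result


print(unmapped_ops(LOG))
```

For the excerpt above this reports two MEAN, one SUB, and one SQUARED_DIFFERENCE operation, which matches the four unmapped operations described in the overview.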

A portion of the structure of the Full Integer Quantized tflite just before conversion to the EdgeTPU model is shown in the figure below. Screenshot 2021-09-03 21:15:35

The structure of the generated EdgeTPU model is shown in the figure below. Screenshot 2021-09-03 21:13:53

4. Script for model transformation

For --model_path ${MODEL}.xml, specify the name of the xml file extracted from person-reidentification-retail-0277_xml_bin.zip. --string_formulas_for_normalization 'data * 1' is the formula used to normalize the calibration dataset during quantization. For example, setting 'data / 255' means that the calibration dataset is divided by 255 to normalize it.

$ docker run -it --rm \
  -v `pwd`:/home/user/workdir \
  pinto0309/openvino2tensorflow:latest

$ cd workdir

$ MODEL=person-reidentification-retail-0277

$ openvino2tensorflow \
  --model_path ${MODEL}.xml \
  --output_saved_model \
  --output_pb \
  --output_no_quant_float32_tflite \
  --output_integer_quant_tflite \
  --string_formulas_for_normalization 'data * 1' \
  --output_integer_quant_type 'uint8' \
  --output_edgetpu

Inside the conversion script, the following command is run to generate the Edge TPU model:

edgetpu_compiler -sad -t 3600 model_full_integer_quant.tflite

Issue Type

Support

Operating System

Ubuntu

Coral Device

USB Accelerator, Accelerator Module

Other Devices

No response

Programming Language

Python 3.8

Relevant Log Output

No response

hjonnala commented 3 years ago

Hi @PINTO0309, by default the Edge TPU compiler supports only one subgraph. Suppose we compile a model with 20 operations without the -a flag: if the first 10 operations are mapped to the Edge TPU and the 11th cannot be mapped, then the remaining 10 operations (except QUANTIZE) are reported with the message "More than one subgraph is not supported".

This message does not appear if we use the -a flag. In your case, you may still be seeing "More than one subgraph is not supported" because of the -d flag.

Please try compiling the attached model both without and with the -a flag, and visualize the resulting models using https://netron.app/

420_edgetpu_test_quant.tflite.tar.gz
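
The with/without -a comparison suggested above could be scripted roughly as follows. This is only a sketch: `compiler_cmd` is a hypothetical helper, and the flag meanings are taken from the Edge TPU Compiler documentation (-s shows the operator table, -a enables multiple subgraphs, -d searches for a delegate, -t sets a timeout in seconds). The compiler is invoked only if it is installed and the model file exists.

```python
import os
import shlex
import shutil
import subprocess


def compiler_cmd(model, multiple_subgraphs=False, search_delegate=False, timeout_sec=None):
    """Build an edgetpu_compiler command line as a list of arguments."""
    cmd = ["edgetpu_compiler", "-s"]      # -s: print the per-operator mapping table
    if multiple_subgraphs:
        cmd.append("-a")                  # -a: allow more than one Edge TPU subgraph
    if search_delegate:
        cmd.append("-d")                  # -d: search for a compilable delegate
    if timeout_sec is not None:
        cmd += ["-t", str(timeout_sec)]   # -t: compilation timeout in seconds
    cmd.append(model)
    return cmd


MODEL = "model_full_integer_quant.tflite"  # file name taken from the issue

for use_a in (False, True):
    cmd = compiler_cmd(MODEL, multiple_subgraphs=use_a)
    print(shlex.join(cmd))
    # Run only when the compiler and the model are actually available.
    if shutil.which("edgetpu_compiler") and os.path.exists(MODEL):
        subprocess.run(cmd, check=False)
```

Comparing the two operator tables, and opening both outputs in Netron, shows which operations fall outside the first Edge TPU subgraph.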

hjonnala commented 3 years ago

It seems the Mean operation with input size (1x256x128x3) cannot be mapped to the Edge TPU. If possible, can you try reducing the size of these Mean operations or eliminating them?

PINTO0309 commented 3 years ago

Thank you. @hjonnala

Although I was a little skeptical, I split the [1, 256, 128, 3] tensor into [1, 64, 128, 3] pieces with Split, applied Mean to each piece, and finally merged the per-piece results. With that change, all operations were mapped to the Edge TPU.

I didn't change the compile options and left them at edgetpu_compiler -sad -t 3600 model_full_integer_quant.tflite

If there is a limit to the size of the tensors that can be processed in each layer, it would be nice if you could make it clear in the error message. The error messages that are currently displayed are very confusing.
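
The reason the split-then-Mean workaround described above preserves the result can be checked numerically. The sketch below uses NumPy in floating point rather than the actual quantized graph, and it assumes that the Mean reduces over the height and width axes (1 and 2) and that the tensor is split into four equal pieces along axis 1; neither detail is stated in this thread. The helper names are hypothetical.

```python
import numpy as np


def chunked_mean(x, num_chunks=4, split_axis=1, reduce_axes=(1, 2)):
    """Mean over reduce_axes, computed per equal-size chunk and then merged."""
    chunks = np.split(x, num_chunks, axis=split_axis)  # raises if not evenly divisible
    partial = [c.mean(axis=reduce_axes, keepdims=True) for c in chunks]
    # With equal-size chunks, the global mean is the plain average of the chunk means.
    return np.mean(np.stack(partial, axis=0), axis=0)


def chunked_variance(x, mean, num_chunks=4, split_axis=1, reduce_axes=(1, 2)):
    """Mean of the squared difference from `mean`, computed chunk by chunk."""
    chunks = np.split(x, num_chunks, axis=split_axis)
    partial = [np.square(c - mean).mean(axis=reduce_axes, keepdims=True) for c in chunks]
    return np.mean(np.stack(partial, axis=0), axis=0)


rng = np.random.default_rng(0)
x = rng.random((1, 256, 128, 3))  # same shape as the model input

full_mean = x.mean(axis=(1, 2), keepdims=True)
full_var = np.square(x - full_mean).mean(axis=(1, 2), keepdims=True)

split_mean = chunked_mean(x)                 # four [1, 64, 128, 3] chunks
split_var = chunked_variance(x, split_mean)

print(split_mean.shape)
print(np.allclose(full_mean, split_mean), np.allclose(full_var, split_var))
```

The equivalence relies on the chunks being the same size; with unequal chunks the merge step would need a weighted average. In the quantized model, small rounding differences between the two forms are still possible.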

Screenshot 2021-09-05 15:32:13 Screenshot 2021-09-05 15:33:42 Screenshot 2021-09-05 15:34:07

Full log.

Edge TPU Compiler version 16.0.384591198
Searching for valid delegate with step 1
Try to compile segment with 514 ops
Started a compilation timeout timer of 3600 seconds.

Model compiled successfully in 4598 ms.

Input model: saved_model/model_full_integer_quant.tflite
Input size: 2.65MiB
Output model: saved_model/model_full_integer_quant_edgetpu.tflite
Output size: 5.00MiB
On-chip memory used for caching model parameters: 100.00KiB
On-chip memory remaining for caching model parameters: 0.00B
Off-chip memory used for streaming uncached model parameters: 3.83MiB
Number of Edge TPU subgraphs: 1
Total number of operations: 514
Operation log: saved_model/model_full_integer_quant_edgetpu.log

Operator                       Count      Status

MEAN                           58         Mapped to Edge TPU
MUL                            55         Mapped to Edge TPU
CONV_2D                        103        Mapped to Edge TPU
DEPTHWISE_CONV_2D              61         Mapped to Edge TPU
LOGISTIC                       24         Mapped to Edge TPU
RESHAPE                        26         Mapped to Edge TPU
FULLY_CONNECTED                1          Mapped to Edge TPU
CONCATENATION                  1          Mapped to Edge TPU
MAX_POOL_2D                    1          Mapped to Edge TPU
QUANTIZE                       2          Mapped to Edge TPU
SQUARED_DIFFERENCE             17         Mapped to Edge TPU
AVERAGE_POOL_2D                2          Mapped to Edge TPU
RSQRT                          14         Mapped to Edge TPU
PRELU                          1          Mapped to Edge TPU
SUB                            14         Mapped to Edge TPU
PAD                            61         Mapped to Edge TPU
ADD                            71         Mapped to Edge TPU
SPLIT                          2          Mapped to Edge TPU
Compilation child process completed within timeout period.
Compilation succeeded! 

EdgeTPU convert complete! - saved_model/model_full_integer_quant_edgetpu.tflite