yolov4 tflite for Google Coral EdgeTPU

jurenovic commented 4 years ago

Hi @hunglc007, first, thanks for all the work you've done.

I followed your tutorial on how to create yolov4-int8.tflite, then successfully converted it to yolov4-int8_edgetpu.tflite using edgetpu_compiler.

I followed the steps from https://github.com/google-coral/tflite/tree/master/python/examples/detection and, instead of running

python3 detect_image.py \
  --model models/mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite \
  --labels models/coco_labels.txt \
  --input images/grace_hopper.bmp \
  --output images/grace_hopper_processed.bmp

I tried with

python3 detect_image.py \
  --model models/yolov4-int8_edgetpu.tflite \
  --labels models/coco_labels.txt \
  --input images/grace_hopper.bmp \
  --output images/grace_hopper_processed.bmp

but it fails with this error:

Traceback (most recent call last):
  File "detect_image.py", line 129, in <module>
    main()
  File "detect_image.py", line 108, in main
    objs = detect.get_output(interpreter, args.threshold, scale)
  File "/google-coral/tflite/python/examples/detection/detect.py", line 147, in get_output
    count = int(output_tensor(interpreter, 3))
  File "/google-coral/tflite/python/examples/detection/detect.py", line 138, in output_tensor
    tensor = interpreter.tensor(interpreter.get_output_details()[i]['index'])()
IndexError: list index out of range
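
Reading the traceback: detect.py asks for output tensor index 3 (the detection count in SSD-style postprocessed models), so the IndexError means the model exposes fewer than four output tensors. A minimal sketch of that check, assuming the output details are a list of dicts like the one returned by interpreter.get_output_details(); the helper name and the stand-in lists are hypothetical, and the YOLO-like output count is only an illustration:

```python
# Hypothetical helper: checks whether a model exposes the four output
# tensors that the Coral SSD detection example indexes (0..3).
REQUIRED_OUTPUTS = 4  # detect.py reads output index 3 for the count


def has_ssd_style_outputs(output_details):
    """Return True if there are enough output tensors for detect.get_output."""
    return len(output_details) >= REQUIRED_OUTPUTS


# Stand-in data shaped like interpreter.get_output_details(); the real
# list would come from the interpreter loaded with the compiled model.
ssd_like = [{"index": i} for i in range(4)]
yolo_like = [{"index": i} for i in range(2)]  # assumed count, illustration only

print(has_ssd_style_outputs(ssd_like))
print(has_ssd_style_outputs(yolo_like))
```

If the check fails for the real model, the SSD example script cannot be reused as-is, and the raw YOLO outputs would need their own decoding step.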

The output from edgetpu_compiler is:

Edge TPU Compiler version 2.1.302470888

Model compiled successfully in 521 ms.

Input model: yolov4-int8.tflite
Input size: 62.67MiB
Output model: yolov4-int8_edgetpu.tflite
Output size: 62.58MiB
On-chip memory used for caching model parameters: 3.00KiB
On-chip memory remaining for caching model parameters: 7.84MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 962
Operation log: yolov4-int8_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 1
Number of operations that will run on CPU: 961
See the operation log file for individual operation details.

Log file:

Edge TPU Compiler version 2.1.302470888
Input: yolov4-int8.tflite
Output: yolov4-int8_edgetpu.tflite

Operator                       Count      Status

LOGISTIC                       6          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
EXP                            72         Operation is working on an unsupported data type
MAX_POOL_2D                    3          More than one subgraph is not supported
MUL                            72         More than one subgraph is not supported
QUANTIZE                       16         More than one subgraph is not supported
QUANTIZE                       196        Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
TANH                           72         More than one subgraph is not supported
ADD                            95         More than one subgraph is not supported
RESIZE_NEAREST_NEIGHBOR        2          More than one subgraph is not supported
PAD                            7          More than one subgraph is not supported
LEAKY_RELU                     35         Operation is working on an unsupported data type
CONCATENATION                  3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
CONCATENATION                  10         More than one subgraph is not supported
CONV_2D                        1          Mapped to Edge TPU
CONV_2D                        109        More than one subgraph is not supported
SPLIT_V                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
DEQUANTIZE                     6          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
DEQUANTIZE                     179        Operation is working on an unsupported data type
LOG                            72         Operation is working on an unsupported data type
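
To see which limitation accounts for most of the unmapped operations, the rows of the log can be grouped by status. This is a rough sketch that assumes the three-column layout shown above (operator, count, status); the function name and regular expression are illustrative and not part of edgetpu_compiler:

```python
import re
from collections import defaultdict

# Rows copied from the operation log above: operator, count, status.
LOG_ROWS = """\
LOGISTIC                       6          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
EXP                            72         Operation is working on an unsupported data type
MAX_POOL_2D                    3          More than one subgraph is not supported
MUL                            72         More than one subgraph is not supported
QUANTIZE                       16         More than one subgraph is not supported
QUANTIZE                       196        Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
TANH                           72         More than one subgraph is not supported
ADD                            95         More than one subgraph is not supported
RESIZE_NEAREST_NEIGHBOR        2          More than one subgraph is not supported
PAD                            7          More than one subgraph is not supported
LEAKY_RELU                     35         Operation is working on an unsupported data type
CONCATENATION                  3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
CONCATENATION                  10         More than one subgraph is not supported
CONV_2D                        1          Mapped to Edge TPU
CONV_2D                        109        More than one subgraph is not supported
SPLIT_V                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
DEQUANTIZE                     6          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
DEQUANTIZE                     179        Operation is working on an unsupported data type
LOG                            72         Operation is working on an unsupported data type
"""

# operator name, integer count, then the free-text status
ROW_RE = re.compile(r"^(\S+)\s+(\d+)\s+(.+)$")


def summarize(log_text):
    """Sum operation counts per status message."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            totals[match.group(3)] += int(match.group(2))
    return dict(totals)


summary = summarize(LOG_ROWS)
total_ops = sum(summary.values())
for status, count in sorted(summary.items(), key=lambda kv: -kv[1]):
    print(f"{count:4d}  {status}")
print("total:", total_ops)
```

The grouped counts add up to the 962 operations reported by the compiler, with only the single CONV_2D mapped to the Edge TPU.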

Is it even possible to run yolov4 on Google Coral? I would appreciate some help. Thanks.

vinorth-v commented 4 years ago

Hi, can you give the output of the edgetpu_compiler command?

jurenovic commented 4 years ago

@vinorth05 I updated the original post ☝️

lthbk5919 commented 4 years ago

@AntonAmes Hi,

If you need Yolo V4 for edge computing, I recommend using the Nvidia Jetson Nano (with TensorRT support) (YoloV4 fp16 at 11 fps)

I'm running YoloV4 fp16 on an Nvidia Jetson Nano with 416 size, but I get ~4 FPS following tkDNN (https://github.com/ceccocats/tkDNN). I'm using Jetson Nano, JetPack 4.4 (CUDA 10.2, cuDNN 8.0.0, TensorRT 7.1.0). Am I wrong?

lthbk5919 commented 4 years ago

Hi, @AntonAmes

I used PyTorch to achieve the 11 FPS result: https://www.seeedstudio.com/blog/2020/06/03/accelerate-yolov4-real-time-object-detection-on-jetson-nano/

I see an inference time of ~0.747 s (1.34 fps), but you say 11 fps. Could you explain?

BasavaG commented 2 years ago

Hi, I am struggling to convert saved model.pb to tflite. Can anyone please help with a detailed procedure? Please 🙏