hhk7734 / tensorflow-yolov4

YOLOv4 Implemented in Tensorflow 2.
MIT License

Enable Edge TPU export and inference for newer Tensorflow versions #45

Closed paradigmn closed 3 years ago

paradigmn commented 3 years ago

As discussed in issue #20, the new tflite converter does not work with this yolov4 implementation. To fix this, the new converter was disabled. Additionally, input and output quantization was switched to uint8 in order to avoid unnecessary casting.
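The converter changes described above might look like the following sketch (illustrative, not the PR's exact code; `model` is assumed to be a loaded Keras model and `representative_dataset` a calibration generator you provide):

```python
import tensorflow as tf

# Assumed to exist: a loaded Keras model and a calibration-data generator.
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Disable the new (MLIR-based) converter, as described above.
converter.experimental_new_converter = False

# Full-integer quantization with uint8 input/output to avoid casting.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("quant_model.tflite", "wb") as f:
    f.write(tflite_model)
```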

The tflite inference module was extended to handle quantized inputs and outputs. The module now prioritizes tflite_runtime over the built-in tensorflow implementation. This should make Edge TPU inference easier.
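The runtime preference could be implemented with a simple import fallback like this sketch (not necessarily the module's exact code):

```python
try:
    # Prefer the lightweight standalone runtime (the only option on a Coral board).
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    try:
        # Fall back to the interpreter bundled with full TensorFlow on the host.
        from tensorflow.lite import Interpreter
    except ImportError:
        # Neither runtime is installed; callers must handle this case.
        Interpreter = None
```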

I added two scripts for model export and inference to evaluate the code.
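For reference, handling the uint8 tensors amounts to applying the interpreter's per-tensor quantization parameters on the way in and out. A hedged sketch (the helper names are made up for illustration, not the module's actual API):

```python
import numpy as np

def quantize_input(frame, scale, zero_point):
    # Map a float32 frame in [0, 1] into the model's uint8 input domain.
    return np.clip(np.round(frame / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize_output(raw, scale, zero_point):
    # Map raw uint8 model output back to float32.
    return (raw.astype(np.float32) - zero_point) * scale
```

With a real interpreter, `scale` and `zero_point` come from the `'quantization'` entry of `interpreter.get_input_details()` / `get_output_details()`.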

hhk7734 commented 3 years ago

What am I doing wrong?

Host

$ python3 -m pip show tensorflow                   
Name: tensorflow
Version: 2.3.1
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/hhk7734/tensorflow-yolov4/venv/lib/python3.8/site-packages
Requires: wheel, six, grpcio, astunparse, h5py, protobuf, tensorflow-estimator, numpy, wrapt, opt-einsum, tensorboard, gast, absl-py, google-pasta, termcolor, keras-preprocessing
Required-by: 
$ python3 export.py
$ edgetpu_compiler -a quant_model.tflite          
Edge TPU Compiler version 15.0.340273435

Model compiled successfully in 1153 ms.

Input model: quant_model.tflite
Input size: 5.94MiB
Output model: quant_model_edgetpu.tflite
Output size: 6.33MiB
On-chip memory used for caching model parameters: 6.06MiB
On-chip memory remaining for caching model parameters: 1.66MiB
Off-chip memory used for streaming uncached model parameters: 3.38KiB
Number of Edge TPU subgraphs: 3
Total number of operations: 132
Operation log: quant_model_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 101
Number of operations that will run on CPU: 31
See the operation log file for individual operation details.

Coral (EdgeTPU)

$ python3 -m pip show tflite_runtime
Name: tflite-runtime
Version: 2.1.0.post1
Summary: TensorFlow Lite is for mobile and embedded devices.
Home-page: https://www.tensorflow.org/lite/
Author: Google, LLC
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /usr/local/lib/python3.7/dist-packages
Requires: numpy
Required-by:
from py_src.yolov4.tflite import YOLOv4
import cv2

yolo = YOLOv4(tiny=True, tpu=True)

yolo.classes = "coco.names"

yolo.load_tflite("quant_model_edgetpu.tflite")

yolo.inference("road.mp4", is_image=False)
Traceback (most recent call last):
  File "edge_yolov4_tiny_video_test.py", line 8, in <module>
    yolo.load_tflite("quant_model_edgetpu.tflite")
  File "/home/mendel/py_src/yolov4/tflite/__init__.py", line 60, in load_tflite
    self.interpreter.allocate_tensors()
  File "/usr/local/lib/python3.7/dist-packages/tflite_runtime/interpreter.py", line 242, in allocate_tensors
    return self._interpreter.AllocateTensors()
  File "/usr/local/lib/python3.7/dist-packages/tflite_runtime/interpreter_wrapper.py", line 115, in AllocateTensors
    return _interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
RuntimeError: tensorflow/lite/kernels/split_v.cc:131 input_type == kTfLiteFloat32 || input_type == kTfLiteUInt8 || input_type == kTfLiteInt16 || input_type == kTfLiteInt32 || input_type == kTfLiteInt64 was not true.Node number 1 (SPLIT_V) failed to prepare.
paradigmn commented 3 years ago

Hi, the video demo is working on my end (although with some boxes in the sky): Screenshot_20201215_110259

You haven't installed the most recent versions of libedgetpu1-max and tflite runtime!

Name: tflite-runtime
Version: 2.5.0
Summary: TensorFlow Lite is for mobile and embedded devices.
Home-page: https://www.tensorflow.org/lite/
Author: Google, LLC
Author-email: packages@tensorflow.org
License: Apache 2.0

hhk7734 commented 3 years ago

I used libedgetpu1-std for comparison.

input_size = (512, 384)

Before
without -a, FPS: 11 ~ 12
with -a, FPS: 9 ~ 10

After
without -a, FPS: 4 ~ 5
with -a, FPS: 11 ~ 13

The speed is about the same, but the results are more accurate. :) Goooood!

hhk7734 commented 3 years ago

Please run black -l 80 . for lint. :)

paradigmn commented 3 years ago

done ^^