google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

edgetpu_compiler: Model not quantized #367

Closed neilyoung closed 3 years ago

neilyoung commented 3 years ago

Hi,

I have re-trained an SSD-MobileNet v1 following this tutorial: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md. I'm able to run the retrained model with the Nvidia DeepStream SDK 5.1 on a Jetson Nano Dev Kit after converting it to an ONNX model. The inference rate is not as good as with other, especially fine-tuned, models on the Nano, but it is at least twice as fast as the full 90-class model.

Now I wanted to run this on a Google Coral TPU. For that I thought it would be necessary to convert the ONNX model to TFLite and then, via the edgetpu compiler, into something that can run on the Coral.

The model retraining code (just one class, "Person", with 5000 images and 30 epochs) runs on Google Colab and consists of these few lines:

# Prepare
!git clone https://github.com/dusty-nv/pytorch-ssd
%cd pytorch-ssd/
!wget https://nvidia.box.com/shared/static/djf5w54rjvpqocsiztzaandq1m3avr7c.pth -O models/mobilenet-v1-ssd-mp-0_675.pth
# Required in order to work around some ugly complaints in the next step
!pip install urllib3==1.25.4 folium==0.2.1
!pip install -v -r requirements.txt

# Download images and annotations from Open Images Dataset
!python open_images_downloader.py --max-images=5000 --class-names "Person"

# Train. Note: Change --epoch to something between 30 and 100
!python train_ssd.py --model-dir=models/person --batch-size=4 --num-epochs=30

# Export to ONNX
!python onnx_export.py --model-dir=models/person

# Download model
from google.colab import files
files.download('models/person/ssd-mobilenet.onnx') 

I'm then converting the downloaded ssd-mobilenet.onnx to TFLite using the code below. I'm running this in a Conda Python 3.8 environment with TF 2.4.1 and tf-nightly installed (reasons below):

import onnx
import tensorflow as tf
from onnx_tf.backend import prepare

print(tf.__version__)

# Convert model.onnx to Tensorflow
onnx_model = onnx.load('ssd-mobilenet.onnx')
onnx.checker.check_model(onnx_model) 
tf_rep = prepare(onnx_model)  
tf_rep.export_graph('ssd-mobilenet')  

# Convert saved model to tflite
converter = tf.lite.TFLiteConverter.from_saved_model('ssd-mobilenet')
tf_lite_model = converter.convert()
open('ssd-mobilenet.tflite', 'wb').write(tf_lite_model)

The reason for tf-nightly was: with TF 2.4.1 I ran into an issue with the TFLite conversion, and with TF 2.2 I ran into an issue with the ONNX conversion. Only tf-nightly got through both steps.
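A quick way to smoke-test the converted file before handing it to the compiler is to run it once through the TFLite interpreter (a minimal sketch; the random input only checks that the model executes, nothing more):

import numpy as np
import tensorflow as tf

# Load the converted model and run it once on random data.
interpreter = tf.lite.Interpreter(model_path='ssd-mobilenet.tflite')
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp['index'], np.random.rand(*inp['shape']).astype(np.float32))
interpreter.invoke()
for out in interpreter.get_output_details():
    print(out['name'], interpreter.get_tensor(out['index']).shape)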

In the end I got my ssd-mobilenet.tflite and tried to compile that on a Linux box using the edgetpu compiler.

And this is the final result:

decades@ubuntu:~$ edgetpu_compiler ssd-mobilenet.tflite 
Edge TPU Compiler version 15.0.340273435
Invalid model: ssd-mobilenet.tflite
Model not quantized
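A minimal sketch to check whether the exported .tflite is quantized at all is to look at its input tensor type:

import tensorflow as tf

# A float32 input dtype here means no full-integer quantization was applied.
interpreter = tf.lite.Interpreter(model_path='ssd-mobilenet.tflite')
print(interpreter.get_input_details()[0]['dtype'])

With the plain converter.convert() call above (no optimizations and no representative dataset), the input will still be float32, while the Edge TPU compiler only accepts fully integer-quantized models.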

I also tried to follow the advice given here https://github.com/google-coral/edgetpu/issues/168 and set

converter.experimental_new_converter = False

to no avail.

Question is: What can be the problem?

EDIT: I have added all the steps described above to the Colab https://colab.research.google.com/drive/1y5YlBFzMAJmKR-qRnT7jzgzGRPtJ6AbM#scrollTo=KGBBDbXQrGqW

EDIT2: Is it maybe because I need to use TF 1.x for the conversion?

EDIT3: No, it also does not compile after converting with TF 1.15 :(

neilyoung commented 3 years ago

I continue to tap in the dark... I could overcome the "Model not quantized" compilation error, only to replace it with "Internal compiler error. Aborting!". Very helpful, indeed.

My ONNX → TF → TFLite conversion now looks like this, including the quantization settings:

import onnx
import tensorflow as tf
from onnx_tf.backend import prepare
import numpy as np

def representative_dataset():
    for _ in range(100):
      # Not sure about this, but ONNX seems to be organized as batch_size, channels, height, width
      # Original TF code commented
      # data = np.random.rand(1, 244, 244, 3)
      data = np.random.rand(1, 3, 300, 300)
      yield [data.astype(np.float32)]

print(tf.__version__)

# Convert model.onnx to Tensorflow
onnx_model = onnx.load('/content/pytorch-ssd/models/person/ssd-mobilenet.onnx')
onnx.checker.check_model(onnx_model) 
onnx.helper.printable_graph(onnx_model.graph)
tf_rep = prepare(onnx_model)  
tf_rep.export_graph('ssd-mobilenet')  

# Convert saved model to tflite
converter = tf.lite.TFLiteConverter.from_saved_model('ssd-mobilenet')

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8

tf_lite_model = converter.convert()
open('ssd-mobilenet.tflite', 'wb').write(tf_lite_model)

It runs and produces a .tflite file, but the compilation still fails with the "Internal compiler error. Aborting!" message...
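As a side note, the representative_dataset above feeds random noise, which is enough to make the converter emit an integer model but gives meaningless calibration ranges; feeding a handful of real training images should work better. A rough sketch (the image folder, resize and normalization are assumptions and should match the preprocessing used during training):

import glob
import numpy as np
from PIL import Image

def representative_dataset():
    # Calibrate on real images instead of random noise (path is a placeholder).
    for path in glob.glob('calibration_images/*.jpg')[:100]:
        img = Image.open(path).convert('RGB').resize((300, 300))
        data = np.asarray(img, dtype=np.float32) / 255.0
        # Keep the NCHW layout of the ONNX export: (1, 3, 300, 300)
        data = np.transpose(data, (2, 0, 1))[np.newaxis, ...]
        yield [data]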

neilyoung commented 3 years ago

ROFL... this works...

!edgetpu_compiler -s 'ssd-mobilenet.tflite' -m 10
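(As far as I can tell from the compiler options, -s prints the per-operation report and -m sets the minimum Edge TPU runtime version the compiled model will require.)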

neilyoung commented 3 years ago

...Well kind of :))

Number of operations that will run on Edge TPU: 0
Number of operations that will run on CPU: 5133

I think I quit here.
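To see which operators actually ended up in the converted model (and therefore why none of them map to the Edge TPU), the per-op report from -s helps; newer TF releases also ship a model analyzer, assuming it is available in the installed version (a sketch):

import tensorflow as tf

# List the operators contained in the converted model
# (requires a TF release that provides tf.lite.experimental.Analyzer).
tf.lite.experimental.Analyzer.analyze(model_path='ssd-mobilenet.tflite')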

richardar commented 7 months ago

@neilyoung Have you ever figured out how to make the model Edge TPU compatible after converting it from the ONNX format?

neilyoung commented 7 months ago

No, sorry. Too long ago

Shivanshu8211 commented 2 months ago

@richardar, were you able to make the model TPU compatible? If yes, then what did you do?