google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

Converting a TF model to TFLite and then to EdgeTPU #655

Closed gsirocco closed 2 years ago

gsirocco commented 2 years ago

I have submitted the following issue to github/tensorflow without any resolution or much feedback, so trying here: https://github.com/tensorflow/tensorflow/issues/57490

Basically, I am trying to take a simple Keras model with an Add operation, convert it to TFLite, and then compile it for the Edge TPU. Quantization to int8 needs to take place, but depending on the conversion parameters provided, it results in either an unsupported operation FlexAddV2, an unsupported data type int32, or an error on AddV2 with Error code: ERROR_NEEDS_FLEX_OPS.

The model and conversion are relatively simple and straightforward:

import tensorflow as tf
from tensorflow import keras
import numpy as np
import random

def representative_dataset():
  for _ in range(100):
    # data = random.randint(0, 1)
    # yield [data]
    data = np.random.rand(32)*2
    yield [data.astype(np.int8)]

input = keras.Input(shape=(32,), name="dummy_input", dtype=tf.int8)
output = tf.add(input, 1)
model = keras.Model(inputs=input, outputs=output)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS  # enable TensorFlow ops.
]
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
converter.experimental_new_quantizer = True  # enables conversion and quantization of MLIR ops
tflite_quant_model = converter.convert()

Output from running the conversion:

2022-09-01 14:40:33.947073: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-01 14:40:33.947098: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-09-01 14:40:34.954701: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-09-01 14:40:34.954726: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-09-01 14:40:34.954741: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ubuntugsosnow): /proc/driver/nvidia/version does not exist
2022-09-01 14:40:34.954944: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert.py:766: UserWarning: Statistics for quantized inputs were expected, but not specified; continuing anyway.
  warnings.warn("Statistics for quantized inputs were expected, but not "
2022-09-01 14:40:35.192240: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:362] Ignored output_format.
2022-09-01 14:40:35.192268: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:365] Ignored drop_control_dependency.
2022-09-01 14:40:35.193142: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/tmprndll_uk
2022-09-01 14:40:35.193790: I tensorflow/cc/saved_model/reader.cc:81] Reading meta graph with tags { serve }
2022-09-01 14:40:35.193826: I tensorflow/cc/saved_model/reader.cc:122] Reading SavedModel debug info (if present) from: /tmp/tmprndll_uk
2022-09-01 14:40:35.195221: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2022-09-01 14:40:35.195628: I tensorflow/cc/saved_model/loader.cc:228] Restoring SavedModel bundle.
2022-09-01 14:40:35.207140: I tensorflow/cc/saved_model/loader.cc:212] Running initialization op on SavedModel bundle at path: /tmp/tmprndll_uk
2022-09-01 14:40:35.211039: I tensorflow/cc/saved_model/loader.cc:301] SavedModel load for tags { serve }; Status: success: OK. Took 17900 microseconds.
2022-09-01 14:40:35.215280: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
loc(callsite(callsite(fused["AddV2:", callsite("model/tf.math.add/Add@__inference__wrapped_model_22"("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py":1325:0) at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py":1290:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1248:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert_phase.py":205:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1318:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1338:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":908:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":929:0 at "/home/gsosnow/doc/gt2tf.py":27:0))))))))] at fused["PartitionedCall:", callsite("PartitionedCall@__inference_signature_wrapper_75"("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py":1325:0) at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py":1290:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1248:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert_phase.py":205:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1318:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":1338:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":908:0 at callsite("/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py":929:0 at "/home/gsosnow/doc/gt2tf.py":27:0))))))))]) at fused["PartitionedCall:", "PartitionedCall"])): error: 'tf.AddV2' op is neither a custom op nor a flex op
error: failed while converting: 'main': Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select
TF Select ops: AddV2
Details: tf.AddV2(tensor<?x32xi8>, tensor) -> (tensor<?x32xi8>) : {device = ""}

Traceback (most recent call last):
  File "/home/gsosnow/doc/gt2tf.py", line 27, in <module>
    tflite_quant_model = converter.convert()
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py", line 929, in wrapper
    return self._convert_and_export_metrics(convert_func, *args, **kwargs)
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py", line 908, in _convert_and_export_metrics
    result = convert_func(self, *args, **kwargs)
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py", line 1338, in convert
    saved_model_convert_result = self._convert_as_saved_model()
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py", line 1320, in _convert_as_saved_model
    return super(TFLiteKerasModelConverterV2,
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/lite.py", line 1131, in convert
    result = _convert_graphdef(
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert_phase.py", line 212, in wrapper
    raise converter_error from None  # Re-throws the exception.
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
    return func(*args, **kwargs)
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert.py", line 794, in convert_graphdef
    data = convert(
  File "/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/lite/python/convert.py", line 311, in convert
    raise converter_error
tensorflow.lite.python.convert_phase.ConverterError: /home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py:1325:0: error: 'tf.AddV2' op is neither a custom op nor a flex op
:0: note: loc(fused["PartitionedCall:", "PartitionedCall"]): called from
/home/gsosnow/anaconda3/lib/python3.9/site-packages/tensorflow/python/saved_model/save.py:1325:0: note: Error code: ERROR_NEEDS_FLEX_OPS
:0: error: failed while converting: 'main': Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select
TF Select ops: AddV2
Details: tf.AddV2(tensor<?x32xi8>, tensor) -> (tensor<?x32xi8>) : {device = ""}

hjonnala commented 2 years ago

Please try the code below; with it you should be able to compile the model for the Edge TPU. Post-training full-integer quantization starts from a float32 model and uses the representative dataset to determine the int8 scales, so the Keras input and the representative samples need to be float32 rather than int8. Thanks!

import tensorflow as tf
from tensorflow import keras
import numpy as np
import random

def representative_dataset():
  for _ in range(100):
    # data = random.randint(0, 1)
    # yield [data]
    data = np.random.rand(32)*2
    yield [data.astype(np.float32)]

input = keras.Input(shape=(32,), name="dummy_input", dtype=tf.float32)
output = tf.add(input, 1)
# output = tf.keras.layers.Add()([input, input])
model = keras.Model(inputs=input, outputs=output)
print(model.summary())
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.int8 # or tf.uint8 
converter.inference_output_type = tf.int8 # or tf.uint8 

tflite_quant_model = converter.convert()

with open('test_quant.tflite', 'wb') as f:
  f.write(tflite_quant_model)
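
To sanity-check the converted model before compiling, something along these lines can be used (a minimal sketch; it assumes the test_quant.tflite file written above) to confirm that the input and output tensors really are int8:

import tensorflow as tf

# Minimal sketch: inspect the quantized model's tensor types before compiling.
# Assumes test_quant.tflite was written by the script above.
interpreter = tf.lite.Interpreter(model_path='test_quant.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print('input dtype:', input_details['dtype'])    # expect numpy.int8
print('output dtype:', output_details['dtype'])  # expect numpy.int8
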
gsirocco commented 2 years ago

Awesome, thanks!! This worked; I was able to convert this model to utilize the TPU!!!

edgetpu_compiler -s testmodel.tflite

Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.

Model compiled successfully in 78 ms.

Input model: testmodel.tflite
Input size: 1.03KiB
Output model: testmodel_edgetpu.tflite
Output size: 20.59KiB
On-chip memory used for caching model parameters: 0.00B
On-chip memory remaining for caching model parameters: 8.12MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 1
Operation log: testmodel_edgetpu.log

Operator       Count      Status
ADD            1          Mapped to Edge TPU

Compilation child process completed within timeout period.
Compilation succeeded!
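
For completeness, here is a minimal inference sketch against the compiled model (assumptions: pycoral is installed, an Edge TPU is attached, and the file name testmodel_edgetpu.tflite matches the compiler output above):

import numpy as np
from pycoral.utils.edgetpu import make_interpreter

# Minimal sketch: run the compiled model on the Edge TPU.
interpreter = make_interpreter('testmodel_edgetpu.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed int8 data matching the model's (1, 32) input shape.
data = np.random.randint(-128, 128, size=tuple(input_details['shape']), dtype=np.int8)
interpreter.set_tensor(input_details['index'], data)
interpreter.invoke()
print(interpreter.get_tensor(output_details['index']))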

google-coral-bot[bot] commented 2 years ago

Are you satisfied with the resolution of your issue?