PINTO0309 / Tensorflow-bin

Prebuilt binary with Tensorflow Lite enabled. For RaspberryPi / Jetson Nano. Support for custom operations in MediaPipe. XNNPACK, XNNPACK Multi-Threads, FlexDelegate.
https://qiita.com/PINTO
Apache License 2.0

Issue in running inference with tf2.5 and tf2.3 on armv7l #45

Closed: saswat0 closed this issue 2 years ago

saswat0 commented 2 years ago

Issue Type

Support

OS

RaspberryPi OS Buster

OS architecture

armv7

Hardware

RaspberryPi3

Description

I trained a TRILL model in tf2.5.0 and converted it to TFLite with the following lines:

```python
import tensorflow as tf

# export_dir points to the SavedModel directory of the trained model
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # regular TFLite ops
    tf.lite.OpsSet.SELECT_TF_OPS,    # fall back to TF ops (needs Flex delegate)
]
tflite_model = converter.convert()

# Second pass with float16 quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_quant_model = converter.convert()

open("TRILL_quant.tflite", "wb").write(tflite_quant_model)
```

Inference works fine when tested in a Colab environment.
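For reference, a minimal sketch of such a check in a full TensorFlow environment like Colab, which links the Flex delegate for SELECT_TF_OPS models automatically (the (1, 48000) float32 input shape is an assumption based on the test program later in this thread):

```python
import numpy as np
import tensorflow as tf

# A full TensorFlow install links the Flex delegate automatically,
# so a SELECT_TF_OPS model loads without extra steps.
interpreter = tf.lite.Interpreter(model_path="TRILL_quant.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Assumed input: a (1, 48000) float32 waveform
interpreter.set_tensor(input_details["index"],
                       np.ones((1, 48000), dtype=np.float32))
interpreter.invoke()
print(interpreter.get_tensor(output_details["index"]).shape)
```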

But when I try to run the same inference with the same model on the Raspberry Pi, I get this error.

I've tried TensorFlow 2.3 and 2.5 from your repository.

Relevant Log Output

```
RuntimeError: Regular TensorFlow ops are not supported by this interpreter. Make sure you apply/link the Flex delegate before inference.Node number 2 (FlexTensorListReserve) failed to prepare.
```
PINTO0309 commented 2 years ago

If you use TensorFlow v2.6.0 or v2.7.0, it may work correctly. However, the build does not succeed on the armv7l architecture; this is a problem that occurs when building TensorFlow as a full package.

Therefore, if all you care about is running TensorFlow Lite, you can install the latest version from the TensorFlowLite wheel packages and it will work properly. https://github.com/PINTO0309/TensorflowLite-bin


download_tflite_runtime-2.7.0-cp37-none-linux_aarch64.whl.sh
download_tflite_runtime-2.7.0-cp37-none-linux_armv7l.whl.sh
download_tflite_runtime-2.7.0-cp38-none-linux_aarch64.whl.sh
download_tflite_runtime-2.7.0-cp38-none-linux_armv7l.whl.sh
download_tflite_runtime-2.7.0-cp39-none-linux_aarch64.whl.sh
download_tflite_runtime-2.7.0-cp39-none-linux_armv7l.whl.sh
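
For the Raspberry Pi 3 (armv7l) with Python 3.7 used here, installation would look roughly like this (a sketch; the wheel filename follows the script names above):

```bash
# Fetch the wheel with the matching download script, then install it.
./download_tflite_runtime-2.7.0-cp37-none-linux_armv7l.whl.sh
pip3 install --upgrade tflite_runtime-2.7.0-cp37-none-linux_armv7l.whl
```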


The log below shows TRILL.tflite being executed successfully using tflite_runtime. You will need to install v2.7.0; please ignore the 2.8.0rc0 shown in my test environment.

```bash
$ docker run -it --rm \
    -v `pwd`:/home/user/workdir \
    ghcr.io/pinto0309/openvino2tensorflow:latest

$ python3 -c 'import tflite_runtime; print(tflite_runtime.__version__)'
2.8.0rc0
```

- Test program (`tflite_test.py`)
```python
import time
from pprint import pprint

import numpy as np
from tflite_runtime.interpreter import Interpreter

############################################################
interpreter = Interpreter(model_path='TRILL.tflite', num_threads=4)
try:
    interpreter.allocate_tensors()
except:
    # XNNPACK may fail to apply on a graph with dynamic-sized tensors;
    # ignore the error and continue without that delegate.
    pass
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

loop = 1
e = 0.0
result = None
inp = np.ones((1, 48000), dtype=np.float32)
for _ in range(loop):
    s = time.time()
    interpreter.set_tensor(input_details[0]['index'], inp)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details[0]['index'])
    e += (time.time() - s)
print('tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/loop*1000}ms')
print(f'shape: {result.shape}')
pprint(result)
```

- Result
```bash
$ python3 tflite_test.py 
INFO: Created TensorFlow Lite delegate for select TF ops.
2022-01-13 08:02:59.038271: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO: TfLiteFlexDelegate delegate: 4 nodes delegated out of 78 nodes with 2 partitions.

INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 349.56932067871094ms
shape: (1, 20)
array([[2.9382650e-39, 4.9900355e-11, 2.9382650e-39, 2.9382650e-39,
        2.9382650e-39, 2.9382650e-39, 7.2997009e-33, 5.0400943e-21,
        2.9382650e-39, 2.9382650e-39, 2.9382650e-39, 2.9382650e-39,
        2.9382650e-39, 1.0000000e+00, 2.9382650e-39, 2.9382650e-39,
        6.6310943e-14, 2.9382650e-39, 2.9382650e-39, 2.9382650e-39]],
      dtype=float32)
```
saswat0 commented 2 years ago

@PINTO0309 Thanks a lot for the detailed guide. However, when I run the above script, it throws the following error.

I used download_tflite_runtime-2.7.0-cp37-none-linux_armv7l.whl.sh

```
Python 3.7.4 (default, Jan 12 2022, 06:05:06)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tflite_runtime.interpreter import Interpreter
>>> interpreter = Interpreter(model_path=f'TRILL.tflite', num_threads=4)
INFO: Created TensorFlow Lite delegate for select TF ops.
INFO: TfLiteFlexDelegate delegate: 4 nodes delegated out of 78 nodes with 2 partitions.

>>> interpreter.allocate_tensors()
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pi/task1/env/lib/python3.7/site-packages/tflite_runtime/interpreter.py", line 521, in allocate_tensors
    return self._interpreter.AllocateTensors()
RuntimeError: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors (tensor#105 is a dynamic-sized tensor).Ignoring failed application of the default TensorFlow Lite delegate indexed at 0.
```
PINTO0309 commented 2 years ago

Now, please try to cross-compile v2.8.0-rc0 yourself, referring to the official procedure below. https://github.com/tensorflow/tensorflow/tree/v2.8.0-rc0/tensorflow/lite/tools/pip_package
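
Roughly, per the linked README, the cross-compile looks like this (a sketch; the Makefile targets and variables may differ between versions):

```bash
# Check out the v2.8.0-rc0 tag and build the tflite_runtime wheel
# for armhf (armv7l) inside Docker, per the pip_package README.
git clone -b v2.8.0-rc0 https://github.com/tensorflow/tensorflow.git
cd tensorflow
make -C tensorflow/lite/tools/pip_package docker-build \
  TENSORFLOW_TARGET=armhf PYTHON_VERSION=3.7
```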

saswat0 commented 2 years ago

If I ignore this error, though, I can run the lines after it. What does this line do exactly? And will ignoring this error be fine for inference?

PINTO0309 commented 2 years ago

If the error message is due to XNNPACK, you can ignore it. Please try running the inference first. Instead of typing the program interactively one line at a time, use the test program I presented above.

saswat0 commented 2 years ago

If I run the test program in one go, it throws the runtime error above and exits. However, if I bypass it with a try/except block, it predicts the correct class at inference.
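
Concretely, the bypass is just the allocation guard from the test program, in isolation (a sketch, catching the RuntimeError raised during tensor allocation):

```python
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path='TRILL.tflite', num_threads=4)
try:
    interpreter.allocate_tensors()
except RuntimeError:
    # XNNPACK rejects the graph's dynamic-sized tensor; the runtime
    # ignores the failed delegate and runs without it.
    pass
# ...then set_tensor / invoke / get_tensor as in the test program above
```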

PINTO0309 commented 2 years ago

If the inference results are displayed correctly, then it works; the result you are seeing is the answer itself. Note, however, that XNNPACK, which optimizes inference performance, will be disabled.

saswat0 commented 2 years ago

Got it. Thanks a lot for the resolution! 🎉

PINTO0309 commented 2 years ago

This gave me some interesting insights I had not anticipated, so I modified the test code above.