quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

Execute quantized model on TFLite with QCS6490 #26

Open mbenencase opened 3 months ago

mbenencase commented 3 months ago

Hi all, I have an AI-BOX running Ubuntu 20.04 from a Qualcomm OEM/ODM with the QCS6490 chipset.

I used the AI Hub website to quantize a YOLOv7 model to a .tflite model, and I'd like to run inference on the QCS6490 device mentioned above.

This is the code I'm using:

import numpy as np
import tensorflow as tf

# Load your TFLite model
# Replace 'model.tflite' with the path to your actual model
tflite_model_path = 'yolov7.tflite'
with open(tflite_model_path, 'rb') as f:
    tflite_model = f.read()

# Set up the TFLite interpreter
# To use the Hexagon DSP with TensorFlow Lite, you would typically need to
# build the TensorFlow Lite Hexagon delegate. However, this script assumes
# that the delegate is already available and part of the TFLite runtime.
interpreter = tf.lite.Interpreter(
    model_content=tflite_model,
    experimental_delegates=[tf.lite.experimental.load_delegate('libhexagon_delegate.so')]
)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_shape = input_details[0]['shape']

print(f"[INFO] Input Shape = {input_shape}")

input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Get the output of the model
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

My question is: where can I find or download the libhexagon_delegate.so library?

bhushan23 commented 3 months ago

Qualcomm has a new AI Engine Direct SDK to run models on the DSP/HTP via the QNN TFLite Delegate. Please follow the steps below to set up and run TFLite models on the QCS6490:

  1. Download the AI Engine Direct SDK from QPM: https://qpm.qualcomm.com/#/main/tools/details/qualcomm_ai_engine_direct
    • NOTE: We recommend running the QPM CLI in a Linux/Windows environment. As of today, running it on macOS will not download all the required libs and the TFLite delegate.
  2. Depending on the target device, copy <QNN_SDK>/libs/<target device> onto the device. Let's call this libs_path.
  3. Depending on the target DSP/HTP architecture, copy <QNN_SDK>/libs/hexagon-v<VERSION>/ onto the device. Let's call this skel_libs_path.
    • You can find this version in the device spec online, e.g. for QCS6490: https://www.qualcomm.com/products/internet-of-things/industrial/industrial-automation/qualcomm-robotics-rb3-platform
  4. Now set the following environment variables before running the model on the device:
export LD_LIBRARY_PATH=<libs_path from step 2>
export ADSP_LIBRARY_PATH=<skel_libs_path from step 3>
  5. Change the above TFLite script so it runs correctly on-device (a full sketch follows this list):
    • Specify backend_type in the options passed to load_delegate
    • Pass <libs_path>/libQnnTFLiteDelegate.so to load_delegate
      tf.lite.experimental.load_delegate("<libs_path>/libQnnTFLiteDelegate.so", options={"backend_type": "htp"})
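
Putting these steps together, here is a minimal sketch of how the script from the first post could be adapted. This is illustrative rather than an official sample: LIBS_PATH is a hypothetical placeholder for wherever you copied libs_path in step 2 (with LD_LIBRARY_PATH and ADSP_LIBRARY_PATH already exported as in step 4), and the model path is carried over from the script above.

import numpy as np
import tensorflow as tf

# Hypothetical placeholder: the on-device directory where you copied the QNN
# libs (libs_path from step 2). Adjust to your actual location.
LIBS_PATH = "/path/to/libs_path"
MODEL_PATH = "yolov7.tflite"

# Load the QNN TFLite delegate and target the HTP backend.
qnn_delegate = tf.lite.experimental.load_delegate(
    f"{LIBS_PATH}/libQnnTFLiteDelegate.so",
    options={"backend_type": "htp"},
)

interpreter = tf.lite.Interpreter(
    model_path=MODEL_PATH,
    experimental_delegates=[qnn_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run one inference on random data to confirm the delegate loads and executes.
# NOTE: if your quantized model expects int8/uint8 input, use
# input_details[0]["dtype"] instead of float32 here.
input_shape = input_details[0]["shape"]
input_data = np.random.random_sample(input_shape).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

output_data = interpreter.get_tensor(output_details[0]["index"])
print(output_data)

Passing model_path to the Interpreter instead of reading the file into model_content keeps the sketch a little shorter; either form works.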

Attaching a sample Python script and QNN 2.20 libs to try on the RB3 Gen2: aarch64-ubuntu-gcc9.4.zip hexagon-v68.zip Model-and-scripts.zip

We will update our docs with these instructions soon to make it easier to deploy on IoT platforms.