quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

Execute quantized model on TFLite with QCS6490 #26

Open mbenencase opened 3 months ago

mbenencase commented 3 months ago

Hi all, I have an AI-BOX running Ubuntu 20.04 from a Qualcomm OEM/ODM with the QCS6490 chipset.

I used the AI Hub website to quantize a YOLOv7 model to a .tflite model, and I'd like to run inference on the QCS6490 device mentioned above.

This is the code I'm using:

import numpy as np
import tensorflow as tf

# Load your TFLite model
# Replace 'model.tflite' with the path to your actual model
tflite_model_path = 'yolov7.tflite'
with open(tflite_model_path, 'rb') as f:
    tflite_model = f.read()

# Set up the TFLite interpreter
# To use the Hexagon DSP with TensorFlow Lite, you would typically need to
# build the TensorFlow Lite Hexagon delegate. However, this script assumes
# that the delegate is already available and part of the TFLite runtime.
interpreter = tf.lite.Interpreter(
    model_content=tflite_model,
    experimental_delegates=[tf.lite.experimental.load_delegate('libhexagon_delegate.so')]
)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_shape = input_details[0]['shape']

print(f"[INFO] Input Shape = {input_shape}")

input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Get the output of the model
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

My question is: where can I find or download the libhexagon_delegate.so library?

bhushan23 commented 3 months ago

Qualcomm has a new AI Engine Direct SDK to run models on the DSP/HTP via the QNN TFLite Delegate. Please follow the steps below to set up and run TFLite models on the QCS6490:

  1. Download the AI Engine Direct SDK from QPM: https://qpm.qualcomm.com/#/main/tools/details/qualcomm_ai_engine_direct
    • NOTE: We recommend running the QPM CLI in a Linux/Windows environment. As of today, running it on macOS will not download all the required libs and the TFLite delegate.
  2. Depending on the target device, copy <QNN_SDK>/libs/<target device> onto the device. Let's call this libs_path.
  3. Depending on the target DSP/HTP architecture, copy <QNN_SDK>/libs/hexagon-v<VERSION>/ onto the device. Let's call this skel_libs_path.
    • You can find this version in the device spec online, e.g. for QCS6490: https://www.qualcomm.com/products/internet-of-things/industrial/industrial-automation/qualcomm-robotics-rb3-platform
  4. Now set the following environment variables before running the model on the device:
export LD_LIBRARY_PATH=<libs_path from step 2>
export ADSP_LIBRARY_PATH=<skel_libs_path from step 3>
  5. Change the above TFLite script so it runs correctly on-device (a full sketch follows this list):
    • Specify backend_type in the options passed to load_delegate
    • Pass <libs_path>/libQnnTFLiteDelegate.so to load_delegate
      tf.lite.experimental.load_delegate("<libs_path>/libQnnTFLiteDelegate.so", options={"backend_type": "htp"})
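
Putting these steps together, here is a minimal sketch of how the script from the first post could be adapted. This is illustrative rather than an official sample: LIBS_PATH is a hypothetical placeholder for wherever you copied libs_path in step 2 (with LD_LIBRARY_PATH and ADSP_LIBRARY_PATH already exported as in step 4), and the model path is carried over from the script above.

import numpy as np
import tensorflow as tf

# Hypothetical placeholder: the on-device directory where you copied the QNN
# libs (libs_path from step 2). Adjust to your actual location.
LIBS_PATH = "/path/to/libs_path"
MODEL_PATH = "yolov7.tflite"

# Load the QNN TFLite delegate and target the HTP backend.
qnn_delegate = tf.lite.experimental.load_delegate(
    f"{LIBS_PATH}/libQnnTFLiteDelegate.so",
    options={"backend_type": "htp"},
)

interpreter = tf.lite.Interpreter(
    model_path=MODEL_PATH,
    experimental_delegates=[qnn_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run one inference on random data to confirm the delegate loads and executes.
# NOTE: if your quantized model expects int8/uint8 input, use
# input_details[0]["dtype"] instead of float32 here.
input_shape = input_details[0]["shape"]
input_data = np.random.random_sample(input_shape).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

output_data = interpreter.get_tensor(output_details[0]["index"])
print(output_data)

Passing model_path to the Interpreter instead of reading the file into model_content keeps the sketch a little shorter; either form works.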

Attaching a sample Python script and QNN 2.20 libs to try on the RB3 Gen2: aarch64-ubuntu-gcc9.4.zip hexagon-v68.zip Model-and-scripts.zip

We will update our docs with these instructions soon to make it easier to deploy on IoT platforms.