microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Accessing Resize op (ResizeNearestNeighbor) in QNN #22549

Open plaurent opened 1 month ago

plaurent commented 1 month ago

Describe the issue

Currently the Resize operation is not supported by QNN on GPU, but ResizeNearestNeighbor and ResizeBilinear are supported (see attached image and https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/SupportedOps.html).

Is there a way for an ONNX model to leverage those supported ops, such as ResizeNearestNeighbor? As far as I can tell, there is no ResizeNearestNeighbor ONNX operator, according to the list here: https://onnx.ai/onnx/operators/
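(For what it's worth, this can be confirmed by querying the ONNX operator schema registry. A minimal sketch, assuming the `onnx` Python package is available; it is imported lazily inside the function so the snippet degrades gracefully when it is not installed:)

```python
def onnx_op_exists(op_type: str) -> bool:
    """Return True if op_type is a registered standard ONNX operator.

    Sketch only: the `onnx` package is imported lazily so the logic can
    be read (and the function defined) without onnx installed.
    """
    try:
        from onnx import defs  # assumption: `pip install onnx`
    except ImportError:
        raise RuntimeError("install the onnx package to run this check")
    try:
        defs.get_schema(op_type)  # raises if the op type is unknown
        return True
    except Exception:
        return False

# Expected: "Resize" is a standard ONNX op; "ResizeNearestNeighbor" is
# a QNN op name only, so the second call should return False.
```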

I'm trying to run a YOLO model on a QCS6490 GPU, but it contains a couple of Resize nodes which are not supported:

```
2024-10-16 20:37:47.101839048 [W:onnxruntime:, qnn_model_wrapper.cc:240 CreateQnnNode] QNN.backendValidateOpConfig() failed for node `Resize` of type `Resize` with error code 4005
2024-10-16 20:37:47.101913737 [W:onnxruntime:, qnn_execution_provider.cc:364 IsNodeSupported] Resize node `Resize` is not supported: base_op_builder.cc:155 ProcessOutputs Failed to add node.
```

I would like to replace those Resize nodes with ResizeNearestNeighbor if that would allow us to run.

I'm running with onnxruntime-rel-1.18.2 and QAIRT / QNN SDK v2.22.6.240515.zip, which is a working combination apart from the lack of Resize node support.

Thank you.

(Attached image: Qualcomm's table of supported QNN GPU ops.)

To reproduce

  1. Export YOLOv8n as onnx opset 11
  2. Attempt to run inference using libQnnGpu.so
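(The two repro steps above can be sketched roughly as follows. This assumes the `ultralytics` package for the export and a QNN-enabled `onnxruntime` build for inference; both are imported lazily inside the functions since neither is a standard dependency, and the file names are illustrative:)

```python
def export_yolov8n_onnx() -> str:
    """Step 1: export YOLOv8n to ONNX at opset 11 (needs `ultralytics`)."""
    from ultralytics import YOLO  # assumption: ultralytics installed
    model = YOLO("yolov8n.pt")
    # export() returns the path of the produced .onnx file
    return model.export(format="onnx", opset=11)

def make_qnn_gpu_session(model_path: str):
    """Step 2: create an ORT session on the QNN EP's GPU backend."""
    import onnxruntime as ort  # assumption: a QNN-enabled ORT build
    return ort.InferenceSession(
        model_path,
        providers=["QNNExecutionProvider"],
        provider_options=[{"backend_path": "libQnnGpu.so"}],
    )
```

With the setup described in this issue, step 2 is where the `QNN.backendValidateOpConfig()` warnings for the Resize nodes appear.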

Urgency

No response

Platform

Linux

OS Version

Linux qcs6490-odk 5.4.233-perf #1 SMP PREEMPT Wed Apr 3 03:19:05 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

rel-1.18.2 commit 9691af1a2a17b12af04652f4d8d2a18ce9507025

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

SNPE

Execution Provider Library Version

QAIRT QNN 2.22.6.240515

HectorSVC commented 1 month ago

We do map Onnx Resize op to ResizeNearestNeighbor for some cases: https://github.com/microsoft/onnxruntime/blob/fc2be09386fe8c195c224b1cbb5b15a1277e0209/onnxruntime/core/providers/qnn/builder/opbuilder/resize_op_builder.cc#L276-L281

Could you try the latest code with latest QNN 2.27?

plaurent commented 1 month ago

Thanks for your reply, @HectorSVC.

We do map Onnx Resize op to ResizeNearestNeighbor for some cases:

onnxruntime/onnxruntime/core/providers/qnn/builder/opbuilder/resize_op_builder.cc

Lines 276 to 281 in fc2be09

```cpp
if (is_npu_backend && input_rank == 4 && interp_mode == "nearest" && nearest_mode == "floor") {
  // Translate Resize with
  // {input_rank: 4, mode: "nearest", nearest_mode: "floor", coordinate_transformation_mode: XXX} to
  // QNN's ResizeNearestNeighbor operator on the HTP backend. This combination of parameters is not supported on HTP
  // via QNN's Resize operator. Note that QNN's ResizeNearestNeighbor operator always uses "floor" rounding.
  qnn_op_type = "ResizeNearestNeighbor";
```

I note that mapping is guarded by is_npu_backend, which I believe is only true if targeting HTP (it is false for GPU — which we want to use).

Should I try to remove that guard, or is there some way to get the mapping to work for the GPU backend?
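(For illustration, the effect of that guard, and of dropping `is_npu_backend` from it, can be mirrored in a few lines of plain Python. This is a sketch of the selection logic only, not ONNX Runtime code, and `require_npu` is a made-up parameter standing in for the proposed edit:)

```python
def select_qnn_resize_op(is_npu_backend: bool, input_rank: int,
                         interp_mode: str, nearest_mode: str,
                         require_npu: bool = True) -> str:
    """Mirror of the guard in resize_op_builder.cc (illustrative only).

    With require_npu=False the is_npu_backend check is dropped, which is
    the modification being discussed here for the GPU backend.
    """
    backend_ok = is_npu_backend or not require_npu
    if (backend_ok and input_rank == 4
            and interp_mode == "nearest" and nearest_mode == "floor"):
        return "ResizeNearestNeighbor"
    return "Resize"

# Today: GPU (is_npu_backend=False) falls through to QNN's Resize,
# which the GPU backend then rejects.
print(select_qnn_resize_op(False, 4, "nearest", "floor"))
# -> Resize
# With the guard relaxed, the same node would map to ResizeNearestNeighbor.
print(select_qnn_resize_op(False, 4, "nearest", "floor", require_npu=False))
# -> ResizeNearestNeighbor
```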

Thank you.

Could you try the latest code with latest QNN 2.27?

I will also give that a try — but it looks like we need to get the mapping to work either way, correct?

plaurent commented 1 month ago

@HectorSVC When I download QNN from https://www.qualcomm.com/developer/software/neural-processing-sdk-for-ai I get v2.26.0.240828 -- is there a 2.27 somewhere else? Thank you.

HectorSVC commented 1 month ago

We have a QNN download link from our webpage: https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct

We don't support the GPU backend for now since we haven't seen strong demand for it. Please let us know your concerns and usage scenarios.

plaurent commented 1 month ago

We have a QNN download link from our webpage: https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct

When I click that link I see "no releases available" and "no documents found" (see screen capture below). Are there special permissions I would need in order to see QNN 2.27?

We don't support the GPU backend for now since we haven't seen strong demand for it. Please let us know your concerns and usage scenarios.

It's surprising there's not a strong desire for GPU support. We are very interested in this, as we have customers who like the price point of running custom trained computer vision/deep learning networks on Qualcomm hardware, instead of the competition. These customers have hundreds/thousands of sites with only 1-3 cameras each, so larger edge devices from the competition are overkill/overpriced.

However, our customers would like to run the same models we currently offer on other edge devices. This is why we are interested in using the GPU if possible — ideally without modifying the models (as we'd have to for the DSP/HTP).

We do map Onnx Resize op to ResizeNearestNeighbor for some cases:

If I remove is_npu_backend && from the code below and rebuild, might that allow OnnxRuntime to map the Onnx Resize op to QNN ResizeNearestNeighbor op for the GPU? Our network is indeed using "nearest" with "floor"... it's simply that OnnxRuntime is not mapping it to ResizeNearestNeighbor for the GPU.

```cpp
if (is_npu_backend && input_rank == 4 && interp_mode == "nearest" && nearest_mode == "floor") {
```

Thank you.

(Screenshot showing no access to QNN 2.27 from the shared link.)

github-actions[bot] commented 3 days ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.