john-dance opened 1 month ago
Hi @john-dance
From the error messages and your description, it seems the overflow occurs during model load, inside the SafeIntExceptionHandler::SafeIntOnOverflow() function. That handler fires when an integer value exceeds the range of its destination type, and an unhandled firing crashes the process.
Agreed, but we need to figure out what is causing that overflow.
I did a little more digging. There is a Gather with an int64 index of -1. The QNN EP should be able to dispatch this Gather to QNN. It's probably hitting this overflow when trying to do the required int64 -> int32 conversion.
(I'll modify the title of the issue.)
Describe the issue
When using the QNN EP, there is an integer overflow on model load.
The model loads and runs on the CPU, or with the TfLite delegate. It may be relevant that with TfLite, the QNN delegate fails to prepare, so execution falls back to GPU+CPU.
Perhaps the following TfLite messages help narrow down the problem with the QNN EP:

[tflite] graph_prepare.cc:210:ERROR:could not create op: q::GatherNd.constIdx.tcm
[tflite] "node_id_512_op_type_GatherNd_op_count_0" generated: could not create op
To reproduce
Use ORT + QNN EP to run the model found in this AI Hub job: https://app.aihub.qualcomm.com/jobs/jp4lvkx15 (Note: Only Microsoft QNN engineers will have access.)
Urgency
No response
Platform
Android
OS Version
14
ONNX Runtime Installation
Built from Source
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
None
ONNX Runtime Version or Commit ID
1.19.2
ONNX Runtime API
C++/C
Architecture
ARM64
Execution Provider
Other / Unknown
Execution Provider Library Version
QNN