DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU

sean830314 commented 1 month ago

Description: Inference on the distilbert-base-uncased model fails when it is run on the NPU of a Snapdragon® X Elite (X1E78100 - Qualcomm®) through ONNX Runtime's QNNExecutionProvider. The same model runs successfully with the CPUExecutionProvider. The logged warnings are node configuration validation failures: QNN.backendValidateOpConfig() rejects many nodes of the ONNX model with error code 3110.

Environment:

Device: Snapdragon® X Elite (X1E78100 - Qualcomm®)
ONNX Runtime Version: onnxruntime-qnn 1.19.0
Model: distilbert-base-uncased
Model Format: Optimized and quantized ONNX model (model_optimized_quantized.onnx)
Execution Provider: QNNExecutionProvider
Python Version: Python 3.10.11
OS: Windows 11

Code Snippet:

import onnxruntime as ort
from transformers import AutoTokenizer
import numpy as np

model_path = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_path)

execution_provider_option = {
    "backend_path": "QnnHtp.dll",
    "session.enable_htp_fp16_precision": "1",
    "htp_performance_mode": "high_performance"
}

sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0  # 0 = verbose logging

providers = ['QNNExecutionProvider']
# Pass sess_options so the log level set above actually takes effect.
ort_session = ort.InferenceSession(
    "model_optimized_quantized.onnx",
    sess_options=sess_options,
    providers=providers,
    provider_options=[execution_provider_option],
)

inputs = tokenizer("Example input text", return_tensors="np")
inputs['input_ids'] = inputs['input_ids'].astype(np.int64)
inputs['attention_mask'] = inputs['attention_mask'].astype(np.int64)

try:
    outputs = ort_session.run(None, dict(inputs))
except Exception as e:
    print(e)

Error Logs:

2024-10-22 10:41:28.9871322 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:28.9947630 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather_output_0_DequantizeLinear/duplicated` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.0026777 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather_output_0_DequantizeLinear` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.0111157 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.0175782 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.0387425 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/position_embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.0453821 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/position_embeddings/Gather_output_0_DequantizeLinear` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.0547608 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Pow` of type `ElementWisePower` with error code 3110

2024-10-22 10:41:29.0787854 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.0836254 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:29.0883248 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Add_1` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.0938929 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/q_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:29.0984110 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/q_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.1180688 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.1211233 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.1258912 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.1287816 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/k_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:29.1329035 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/k_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.1360704 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.1396786 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.1438294 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.1468232 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Equal` of type `ElementWiseEqual` with error code 3110

2024-10-22 10:41:29.1501107 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/v_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:29.1532698 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/v_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.1562286 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.1604085 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.1629870 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.1662744 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.1694191 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.1717071 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.1749868 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.1940785 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.1977746 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.2076521 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.2173333 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.2282815 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.2365697 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.2480647 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.2558181 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.2645440 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.2750206 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.3235208 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.3367552 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.3482743 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.3839070 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.4349486 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.4521995 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.4658055 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.4787242 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.4914660 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.5030083 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.5149475 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.5263062 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.5398282 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.6194848 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.6364233 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:29.6498045 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:29.6641104 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:29.6760986 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:29.6876170 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:29.7229840 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.7461404 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather_output_0_DequantizeLinear/duplicated` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.8042848 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/word_embeddings/Gather_output_0_DequantizeLinear` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.8249188 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.8373125 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:29.8477836 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/position_embeddings/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:29.9116590 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/position_embeddings/Gather_output_0_DequantizeLinear` of type `Dequantize` with error code 3110

2024-10-22 10:41:29.9274037 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Pow` of type `ElementWisePower` with error code 3110

2024-10-22 10:41:29.9381536 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.9477846 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:29.9857951 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/embeddings/LayerNorm/Add_1` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:29.9971484 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/q_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:30.0102044 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/q_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:30.0835550 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.0974709 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.1105947 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.1226836 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/k_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:30.1370358 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/k_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:30.2042000 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.2194914 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.2317003 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.2430467 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Equal` of type `ElementWiseEqual` with error code 3110

2024-10-22 10:41:30.2558380 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/v_lin/MatMul_quant_output_scale_mul` of type `ElementWiseMultiply` with error code 3110

2024-10-22 10:41:30.2704601 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/v_lin/Add` of type `ElementWiseAdd` with error code 3110

2024-10-22 10:41:30.2841350 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.0/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:30.4028658 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.4089427 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.4143851 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.4209290 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.4264575 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.4301982 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.4340360 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.1/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:30.4377872 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.4429231 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.4465753 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.4595797 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.4647680 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.4682391 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.4710961 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.2/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:30.4749570 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.4778831 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.5049757 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.5131264 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.5212589 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.5302470 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.5382602 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.3/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:30.5442839 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.5566445 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.5658993 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.5747744 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.5840854 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.5930702 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.6008437 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.4/attention/Concat_4` of type `Concat` with error code 3110

2024-10-22 10:41:30.6087052 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Gather` of type `Gather` with error code 3110

2024-10-22 10:41:30.6170346 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Unsqueeze` of type `Reshape` with error code 3110

2024-10-22 10:41:30.6258396 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat` of type `Concat` with error code 3110

2024-10-22 10:41:30.6877206 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Gather_1` of type `Gather` with error code 3110

2024-10-22 10:41:30.6970391 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Unsqueeze_4` of type `Reshape` with error code 3110

2024-10-22 10:41:30.7047083 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat_3` of type `Concat` with error code 3110

2024-10-22 10:41:30.7093309 [W:onnxruntime:, qnn_model_wrapper.cc:244 onnxruntime::qnn::QnnModelWrapper::CreateQnnNode] QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat_4` of type `Concat` with error code 3110
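
A long run of near-identical warnings like the one above is easier to read once it is tallied by op type. The following is a minimal sketch using only the standard library; the function name `tally_failures` and the `sample` text are illustrative, and the regular expression assumes the message format shown in the log lines above.

```python
import re
from collections import Counter

# Captures node name, op type, and error code from the QNN EP warning text.
PATTERN = re.compile(
    r"failed for node `(?P<node>[^`]+)` of type `(?P<op>[^`]+)` "
    r"with error code (?P<code>\d+)"
)


def tally_failures(log_text):
    """Count validation failures per (op type, error code) pair."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(match["op"], match["code"])] += 1
    return counts


# A few message bodies copied from the log above (timestamps omitted).
sample = """
QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Gather` of type `Gather` with error code 3110
QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Unsqueeze` of type `Reshape` with error code 3110
QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat` of type `Concat` with error code 3110
QNN.backendValidateOpConfig() failed for node `/distilbert/transformer/layer.5/attention/Concat_3` of type `Concat` with error code 3110
"""

for (op, code), n in tally_failures(sample).most_common():
    print(f"{op:10s} code={code} failures={n}")
```

Running it over the full saved log instead of `sample` would show whether the rejected nodes are limited to a few op types (here Gather, Reshape, and Concat in the attention blocks) or spread across the whole graph.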

HectorSVC commented 1 month ago

When you generated the QDQ model, did you use the following?

# Generate a suitable quantization configuration for this model.
# Note that we're choosing to use uint16 activations and uint8 weights.
qnn_config = get_qnn_qdq_config(model_to_quantize,
                                my_data_reader,
                                activation_type=QuantType.QUInt16,  # uint16 activations
                                weight_type=QuantType.QUInt8)       # uint8 weights

# Quantize the model.
quantize(model_to_quantize, output_model_path, qnn_config)

For more details, refer to https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-model-with-qnn-eps-htp-backend-python

You can also try the latest nightly build, which has fp16 precision enabled by default:

python -m pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime==1.21.0.dev20241021001

ashumish-QCOM commented 4 weeks ago

Hi @sean830314,

Have you had a chance to follow the steps suggested by @HectorSVC? Could you please provide an update on your progress?

Thank you.

sean830314 commented 3 weeks ago

Thanks, @HectorSVC and @ashumish-QCOM.

I referred to the following link to modify the quantization code: https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-model-with-qnn-eps-htp-backend-python.

Latest Update: I followed the previous suggestions and made adjustments, but warnings still appear during quantization. The quantizer reports that it cannot infer the type of several tensors, and those tensors are skipped in multiple layers.

Environment:

Python Version: 3.11.1 (amd64)
ONNX Library Versions:
    onnx: 1.17.0
    onnxruntime: 1.19.2
    onnxruntime-qnn: 1.19.0
    numpy: 1.26.4

qnq_quant.py code:

import argparse
import data_reader
from onnxruntime.quantization import QuantType, quantize
from onnxruntime.quantization.execution_providers.qnn import get_qnn_qdq_config, qnn_preprocess_model

def quantize_model(model_input, model_output):

    my_data_reader = data_reader.DataReader(model_input)

    preproc_model_path = "model.preproc.onnx"
    model_changed = qnn_preprocess_model(model_input, preproc_model_path)
    # Fall back to the original model if pre-processing made no changes.
    model_to_quantize = preproc_model_path if model_changed else model_input

    qnn_config = get_qnn_qdq_config(model_to_quantize,
                                    my_data_reader,
                                    activation_type=QuantType.QUInt16,  # uint16 activations
                                    weight_type=QuantType.QUInt8)       # uint8 weights

    # Quantize the model.
    quantize(model_to_quantize, model_output, qnn_config)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Quantize an ONNX DistilBERT model to QDQ format for the QNN EP.")

    parser.add_argument('--model_input', type=str, required=True, help='Path to the input ONNX model.')
    parser.add_argument('--model_output', type=str, required=True, help='Path to save the quantized ONNX model.')

    args = parser.parse_args()
    quantize_model(args.model_input, args.model_output)

data_reader.py code:

import numpy as np
import onnxruntime
from onnxruntime.quantization import CalibrationDataReader
from transformers import DistilBertTokenizer

class DataReader(CalibrationDataReader):
    def __init__(self, model_path: str, tokenizer_name: str = "distilbert-model", max_length: int = 512):
        self.enum_data = None
        self.tokenizer = DistilBertTokenizer.from_pretrained(tokenizer_name)
        self.max_length = max_length

        # Use an inference session to get the model's input names.
        session = onnxruntime.InferenceSession(model_path, providers=['CPUExecutionProvider'])
        inputs = session.get_inputs()
        input_names = [inp.name for inp in inputs]

        # Ten placeholder sentences (replace with your calibration dataset)
        # TODO: Load valid calibration input text data for your model
        example_texts = [
            "This is a sample sentence for calibration.",
            "Another example text for the calibration process.",
            "Text input to verify the calibration process.",
            "DistilBERT uses tokenizers to process inputs.",
            "Calibration data for optimizing DistilBERT.",
            "Random text generation for the calibration task.",
            "Using tokenization with the calibration input.",
            "Sentence for testing calibration accuracy.",
            "Random text for creating input data.",
            "Final example of calibration input text."
        ]

        self.data_list = []

        for text in example_texts:
            # Tokenize the text
            tokens = self.tokenizer(
                text,
                padding="max_length",
                truncation=True,
                max_length=self.max_length,
                return_tensors="np"
            )

            # Create input data for the model
            input_data = {name: tokens[name].astype(np.int64) for name in input_names if name in tokens}
            self.data_list.append(input_data)

        self.datasize = len(self.data_list)

    def get_next(self):
        if self.enum_data is None:
            self.enum_data = iter(self.data_list)
        return next(self.enum_data, None)

    def rewind(self):
        self.enum_data = None
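
Before handing a reader like this to the quantizer, it can help to confirm that it yields the expected number of batches and that rewind() really resets it. The sketch below is one way to check that contract; `count_batches`, `check_rewind`, and the `ListReader` stand-in are hypothetical helpers written for illustration, not part of ONNX Runtime.

```python
def count_batches(reader):
    """Drain a calibration reader and return how many batches it yields."""
    n = 0
    while reader.get_next() is not None:
        n += 1
    return n


def check_rewind(reader):
    """Return (first_pass, second_pass) batch counts around a rewind()."""
    reader.rewind()
    first = count_batches(reader)
    reader.rewind()
    second = count_batches(reader)
    return first, second


class ListReader:
    """Stand-in with the same get_next/rewind contract as DataReader above."""

    def __init__(self, items):
        self.items = list(items)
        self._it = None

    def get_next(self):
        # Create the iterator lazily, then return None once it is exhausted.
        if self._it is None:
            self._it = iter(self.items)
        return next(self._it, None)

    def rewind(self):
        self._it = None


print(check_rewind(ListReader([{"input_ids": [1, 2]}, {"input_ids": [3]}])))
```

Passing an instance of the DataReader class above to check_rewind should report the same count for both passes; with the ten example sentences in the list, that count would be expected to be 10.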

Executed the quantization script using the command:

python.exe .\qnq_quant.py --model_input .\distilbert-model\model.onnx --model_output model.qnq.onnx

Error Messages:

WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.1/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.1/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.1/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.1/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.2/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.2/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.2/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.2/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.3/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.3/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.3/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.3/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.4/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.4/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.4/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.4/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.5/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.5/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.5/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.5/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:Please consider pre-processing before quantization. See https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
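
Each warning in the log above is printed twice, so the 24 lines cover fewer distinct tensors than they first suggest. A minimal sketch for de-duplicating them and grouping them by transformer layer follows; the helper names and the `sample` text are illustrative, and the regular expressions assume the warning format and tensor naming shown above.

```python
import re
from collections import defaultdict

# Captures the tensor name between the fixed prefix and ". Skip".
WARNING = re.compile(r"failed to infer the type of tensor: (\S+?)\. Skip")


def unique_skipped_tensors(log_text):
    """Return the distinct tensor names from the warnings, in first-seen order."""
    seen = {}
    for name in WARNING.findall(log_text):
        seen.setdefault(name, None)
    return list(seen)


def group_by_layer(names):
    """Map layer index -> tensor paths relative to that layer."""
    groups = defaultdict(list)
    for name in names:
        m = re.search(r"layer\.(\d+)/(.+)$", name)
        if m:
            groups[int(m.group(1))].append(m.group(2))
        else:
            groups[None].append(name)  # tensors outside the transformer layers
    return dict(groups)


# Four lines copied from the log above.
sample = """
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/activation/Mul_1_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
WARNING:root:failed to infer the type of tensor: /distilbert/transformer/layer.0/ffn/lin2/MatMul_output_0. Skip to quantize it. Please check if it is expected.
"""

for layer, tensors in group_by_layer(unique_skipped_tensors(sample)).items():
    print(layer, tensors)
```

Applied to the full log, this should reduce the output to two tensors per layer (the FFN activation Mul_1 output and the lin2 MatMul output) for layers 0 through 5.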

ashumish-QCOM commented 3 weeks ago

Hi @sean830314,

Thanks for reporting this issue. It seems the errors are related to node configuration validation failures during inference with the QNNExecutionProvider on the Snapdragon® X Elite NPU.

Here are a few suggestions to troubleshoot:

1. Pre-processing Before Quantization: Run pre-processing before quantization, as the warning log suggests. Refer to the ONNX Runtime quantization guide for details.
2. Check Tensor Types: Ensure all tensors have the correct types and shapes before quantization. The warnings indicate that some tensor types couldn't be inferred, so those tensors were skipped during quantization.
3. Review Quantization Configuration: Verify that your quantization configuration aligns with the QNNExecutionProvider requirements. Adjust settings or flags as needed.
4. Update ONNX Runtime: Ensure you're using the latest version of ONNX Runtime, as updates may resolve compatibility issues.
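
For the tensor type check, one possible approach is to run ONNX shape inference and list the node outputs that still have no element type recorded. This is a sketch under assumptions: `outputs_without_type_info` and `report` are hypothetical helpers, and the idea that the quantizer skips tensors lacking typed value_info is my reading of the warning text, not something confirmed in this thread.

```python
def outputs_without_type_info(graph):
    """Return node output names that have no typed entry in the graph."""
    typed = set()
    for vi in list(graph.value_info) + list(graph.input) + list(graph.output):
        # elem_type 0 is TensorProto.UNDEFINED, i.e. no known element type.
        if vi.type.tensor_type.elem_type != 0:
            typed.add(vi.name)
    initializers = {init.name for init in graph.initializer}

    missing = []
    for node in graph.node:
        for out in node.output:
            # Empty strings mark optional outputs that are not produced.
            if out and out not in typed and out not in initializers:
                missing.append(out)
    return missing


def report(model_path):
    """Load a model, run ONNX shape inference, and list untyped outputs."""
    # Imported lazily so the helper above has no hard dependency on onnx.
    import onnx
    from onnx import shape_inference

    model = onnx.load(model_path)
    inferred = shape_inference.infer_shapes(model)
    return outputs_without_type_info(inferred.graph)
```

Calling report(r".\distilbert-model\model.onnx") and comparing the result with the tensor names in the quantization warnings would show whether the same tensors are missing type information after plain ONNX shape inference.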

Let us know if it helps.

Thank you.

sean830314 commented 2 weeks ago

Hi @ashumish-QCOM

Thank you for your suggestions. I attempted the first step, "Pre-processing Before Quantization," but encountered the following error message:

PS C:\Users\kroos\Desktop\kroos\quantization-distilbert> python -m onnxruntime.quantization.preprocess --input .\distilbert-model\model.onnx --output model-infer.onnx
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\kroos\AppData\Local\Programs\Python\Python311\Lib\site-packages\onnxruntime\quantization\preprocess.py", line 127, in <module>
    quant_pre_process(
  File "C:\Users\kroos\AppData\Local\Programs\Python\Python311\Lib\site-packages\onnxruntime\quantization\shape_inference.py", line 81, in quant_pre_process
    model = SymbolicShapeInference.infer_shapes(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kroos\AppData\Local\Programs\Python\Python311\Lib\site-packages\onnxruntime\tools\symbolic_shape_infer.py", line 2932, in infer_shapes
    raise Exception("Incomplete symbolic shape inference")
Exception: Incomplete symbolic shape inference

This error appears to be related to incomplete symbolic shape inference. I'm not sure if there are any additional settings or parameters that could prevent this error. Do you have any suggestions for resolving it?

Thank you!
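
One thing that may be worth trying for the exception above is calling the pre-processing function from Python with more lenient settings. The sketch below tries the defaults first, then `auto_merge=True`, then `skip_symbolic_shape=True`; both are parameters of `quant_pre_process` as far as I know, but whether either resolves this model's failure is untested. The wrapper `preprocess_with_fallback` and its injectable `preprocess` argument are hypothetical, added only so the logic can be exercised without ONNX Runtime installed.

```python
def preprocess_with_fallback(input_path, output_path, preprocess=None):
    """Try pre-processing with progressively more lenient settings.

    Returns the keyword arguments of the first attempt that succeeded.
    """
    if preprocess is None:
        # Imported lazily; requires the onnxruntime package.
        from onnxruntime.quantization.shape_inference import quant_pre_process
        preprocess = quant_pre_process

    attempts = [
        {},                             # defaults, as in the failing command
        {"auto_merge": True},           # let symbolic inference merge conflicting dims
        {"skip_symbolic_shape": True},  # skip the step that raised the exception
    ]
    errors = []
    for kwargs in attempts:
        try:
            preprocess(input_path, output_path, **kwargs)
            return kwargs
        except Exception as exc:  # the failure above is raised as a bare Exception
            errors.append((kwargs, exc))
    raise RuntimeError(f"all pre-processing attempts failed: {errors}")
```

A call such as preprocess_with_fallback(r".\distilbert-model\model.onnx", "model-infer.onnx") returns the settings that worked. Note that skipping symbolic shape inference leaves only ONNX shape inference and graph optimization, so some tensors may still lack shape or type information afterwards.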

microsoft / onnxruntime

DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU #22532