apache / tvm

Open deep learning compiler stack for CPU, GPU and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[ONNX][Relay][QNN] tvm doesn't support mix-precision inputs for qnn matmul #13466

Open vvchernov opened 1 year ago

vvchernov commented 1 year ago

Expected behavior

No type-matching check should be applied to QNN matmul inputs, since the ONNX documentation places no such constraint on the QLinearMatMul operation.
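As a sketch of why mixed-precision inputs are well defined, here is a hypothetical NumPy-only reference implementation of QLinearMatMul semantics (the function name and signature are illustrative assumptions, not TVM or ONNX API): each input is dequantized with its own scale and zero point, so the two quantized dtypes never need to match.

```python
import numpy as np

def qlinear_matmul(a, a_scale, a_zp, b, b_scale, b_zp,
                   y_scale, y_zp, y_dtype=np.uint8):
    # Dequantize each input independently; the input dtypes may differ.
    a_fp = (a.astype(np.int32) - int(a_zp)) * a_scale
    b_fp = (b.astype(np.int32) - int(b_zp)) * b_scale
    # Float matmul on the dequantized values.
    y_fp = a_fp @ b_fp
    # Requantize to the requested output type.
    q = np.round(y_fp / y_scale) + int(y_zp)
    info = np.iinfo(y_dtype)
    return np.clip(q, info.min, info.max).astype(y_dtype)

# Mixed-precision inputs: uint8 activations, int8 weights.
a = np.array([[130, 2], [7, 255]], dtype=np.uint8)
b = np.array([[-3, 5], [1, -7]], dtype=np.int8)
y = qlinear_matmul(a, 0.5, 128, b, 0.25, 0, 1.0, 0, np.uint8)
```

The computation goes through without any requirement that `a` and `b` share a dtype, which matches the ONNX spec's lack of a matching constraint.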

Actual behavior

A quantized model from Hugging Face fails during compilation because TVM checks that the input tensor types match.

Environment

Ubuntu 20.04 LTS

Steps to reproduce

The usual steps for compiling and running an ONNX model with the VirtualMachine through the Python front end

Triage

Notes

There is a similar issue and discussion for QNN conv2d; unfortunately it contains no solution and only minimal discussion. The problem looks more general: TVM Relay requires matching input types for QNN operations, even though the ONNX op descriptions do not assume matching types.

cc @KJlaccHoeUM9l @ehsanmok

vvchernov commented 1 year ago

cc @masahi