openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

Does OpenVINO support INT8 Matmul? #24812

Open Septend-fun opened 1 month ago

Septend-fun commented 1 month ago

OpenVINO Version

2024.1.0

Operating System

Windows System

Device used for inference

NPU

Framework

None

Model used

Matmul

Issue description

I'd like to benchmark a MatMul op using the sync_benchmark sample (https://github.com/openvinotoolkit/openvino/tree/releases/2024/1/samples/cpp/benchmark/sync_benchmark) with an IR XML file. Running sync_benchmark.exe matmul.xml CPU produces normal results, but running sync_benchmark.exe matmul.xml NPU fails with the error shown below.

Screenshot 2024-06-03 120021
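To narrow down whether the failure comes from the NPU plugin rejecting the INT8 MatMul rather than from the sample itself, a quick check with the OpenVINO Python API can be sketched as below. This is only a sketch: it assumes the `openvino` Python package is installed and that `matmul.xml` (the IR from the reproduction steps) is in the working directory; whether "NPU" appears at all depends on the installed driver.

```python
def try_compile(xml_path, device):
    """Attempt to compile an IR on the given device; return (ok, message)."""
    try:
        import openvino as ov  # assumes `pip install openvino`
        core = ov.Core()
        model = core.read_model(xml_path)
        core.compile_model(model, device)
        return True, "compiled OK on " + device
    except Exception as e:
        # The exception text identifies the failing plugin/step.
        return False, str(e)

if __name__ == "__main__":
    for device in ("CPU", "NPU"):
        ok, msg = try_compile("matmul.xml", device)
        print(device, "->", msg)
```

If CPU compiles but NPU raises, the exception message from `compile_model` should name the unsupported layer or precision directly.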

Step-by-step reproduction

  1. Build and get sync_benchmark.exe [https://github.com/openvinotoolkit/openvino/tree/releases/2024/1/samples/cpp/benchmark/sync_benchmark]
  2. Run sync_benchmark.exe matmul.xml NPU, where matmul.xml is:
    <?xml version="1.0"?>
    <net name="main_graph" version="11">
    <layers>
        <layer id="1" name="input1" type="Parameter" version="opset1">
            <data shape="1,1,1024" element_type="i8" />
            <output>
                <port id="0" precision="I8" names="input1">
                    <dim>1</dim>
                    <dim>1</dim>
                    <dim>1024</dim>
                </port>
            </output>
        </layer>
        <layer id="0" name="input2" type="Parameter" version="opset1">
            <data shape="1,1024,1024" element_type="i8" />
            <output>
                <port id="0" precision="I8" names="input2">
                    <dim>1</dim>
                    <dim>1024</dim>
                    <dim>1024</dim>
                </port>
            </output>
        </layer>
        <layer id="2" name="output" type="MatMul" version="opset1">
            <data transpose_a="false" transpose_b="true" />
            <input>
                <port id="0" precision="I8">
                    <dim>1</dim>
                    <dim>1</dim>
                    <dim>1024</dim>
                </port>
                <port id="1" precision="I8">
                    <dim>1</dim>
                    <dim>1024</dim>
                    <dim>1024</dim>
                </port>
            </input>
            <output>
                <port id="2" precision="I32" names="output">
                    <dim>1</dim>
                    <dim>1</dim>
                    <dim>1024</dim>
                </port>
            </output>
        </layer>
        <layer id="3" name="output/sink_port_0" type="Result" version="opset1">
            <input>
                <port id="0" precision="I32">
                    <dim>1</dim>
                    <dim>1</dim>
                    <dim>1024</dim>
                </port>
            </input>
        </layer>
    </layers>
    <edges>
        <edge from-layer="0" from-port="0" to-layer="2" to-port="1" />
        <edge from-layer="1" from-port="0" to-layer="2" to-port="0" />
        <edge from-layer="2" from-port="2" to-layer="3" to-port="0" />
    </edges>
    <rt_info>
        <MO_version value="2024.1.0-15008-f4afc983258-releases/2024/1" />
        <Runtime_version value="2024.1.0-15008-f4afc983258-releases/2024/1" />
        <conversion_parameters>
            <input_model value="DIR\matmul.onnx" />
            <is_python_api_used value="False" />
        </conversion_parameters>
        <legacy_frontend value="False" />
    </rt_info>
    </net>
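For reference, the computation the IR above describes (i8 inputs, transpose_b="true", i32 output) can be sketched in NumPy; the shapes follow the dim entries in matmul.xml, and the i32 widening mirrors the I32 output port in the IR:

```python
import numpy as np

# Reference semantics of the INT8 MatMul in matmul.xml: both inputs are i8,
# transpose_b="true" transposes the last two axes of B, and the result is
# accumulated in i32 (matching the I32 output port in the IR).
rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(1, 1, 1024), dtype=np.int8)     # input1
b = rng.integers(-128, 128, size=(1, 1024, 1024), dtype=np.int8)  # input2

# Widen to i32 before the multiply so accumulation does not overflow i8.
out = np.matmul(a.astype(np.int32), b.transpose(0, 2, 1).astype(np.int32))
print(out.shape, out.dtype)  # (1, 1, 1024) int32
```

This is what the CPU plugin computes successfully; the question is whether the NPU plugin supports the same i8 x i8 -> i32 MatMul natively.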

Relevant log output

No response

Issue submission checklist

avitial commented 1 week ago

Ref. 146007