openvinotoolkit / openvino.genai

Run Generative AI models with simple C++/Python API and using OpenVINO Runtime
Apache License 2.0

Dynamic Shape Issue When Run Whisper On NPU #895

Open weberwcwei opened 1 month ago

weberwcwei commented 1 month ago

Description

When attempting to run a Whisper model on NPU, an error occurs indicating that the shape is dynamic. This prevents the model from being executed on the NPU. Is there an example of how to ensure that the model has static shapes?

Sample Code

import openvino_genai as ov_genai

MODEL_PATH = "model_zoo/whisper-base-openvino"
pipe = ov_genai.WhisperPipeline(MODEL_PATH, device="NPU")

Error Message

RuntimeError: Exception from src/inference/src/cpp/core.cpp:124:
Exception from src/inference/src/dev/plugin.cpp:58:
Exception from src/plugins/intel_npu/src/plugin/src/plugin.cpp:697:
Exception from src/plugins/intel_npu/src/plugin/src/compiled_model.cpp:62:
Exception from src/core/src/partial_shape.cpp:266:
to_shape was called on a dynamic shape.

Environment

andrei-kochin commented 1 month ago

Hello @weberwcwei,

The Whisper pipeline is not yet released and is still in progress. But this is valuable feedback!

@dmatveev @TolyaTalamanov @as-suvorov FYI

weberwcwei commented 1 month ago

Hello @andrei-kochin

Thanks for the update! Do you have any timeline on when the Whisper pipeline is expected to be released?

andrei-kochin commented 1 month ago

@weberwcwei no exact timeline for NPU unfortunately, but the team is already working on it

andrei-kochin commented 2 weeks ago

@weberwcwei good news! you can try the NPU with the latest source or nightly builds

weberwcwei commented 2 weeks ago

@andrei-kochin, thank you and the genai team for adding these features

I tried out the nightly builds and tested Whisper on the NPU, but I encountered another error. Could you help me troubleshoot this?

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 4
      1 import openvino_genai as ov_genai
      3 MODEL_PATH = "models/whisper-base"
----> 4 pipe = ov_genai.WhisperPipeline(MODEL_PATH, device="NPU")

RuntimeError: Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\intel_npu\src\plugin\src\plugin.cpp:705:
Exception from src\plugins\intel_npu\src\plugin\src\compiled_model.cpp:66:
Exception from src\plugins\intel_npu\src\compiler\src\zero_compiler_in_driver.cpp:828:
L0 pfnCreate2 result: ZE_RESULT_ERROR_INVALID_ARGUMENT, code 0x78000004 - generic error code for invalid arguments . Partial shape has dimension with no upper bounds: [1,1500,?]
Failed to create executable

andrei-kochin commented 2 weeks ago

@dmatveev @TolyaTalamanov should the NPU driver be updated from 32.0.100.2714 to some newer version?

weberwcwei commented 2 weeks ago

FYI, the NPU driver has been updated to 32.0.100.3053, the chipset is U7 165U, and the Python environment is listed in requirement.txt.

soumendukrg commented 1 week ago

Tested Whisper on NPU using nightly builds. Runs successfully. Noticed an error only when max_new_tokens is 2048 or higher.

soumendukrg commented 1 week ago

Exporting Whisper with the latest optimum-intel introduces a dynamic dimension in a decoder model input that was static in previous versions. This is what causes the issue reported by @weberwcwei.
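
To see which input of an exported model carries the unbounded dimension, one option is to print each input's partial shape and flag dimensions with no upper bound. The following is a minimal sketch, not part of the GenAI pipeline: the helper names `unbounded_dims` and `report_model_inputs` are hypothetical, and it assumes that `str()` of an OpenVINO partial shape uses the same notation as the error message above (for example `[1,1500,?]`, with `?` for a fully dynamic dimension and `N..` for a lower bound only).

```python
from typing import Dict, List


def unbounded_dims(shape_text: str) -> List[int]:
    """Return indices of dimensions that have no upper bound.

    Expects the bracketed text form seen in the error, e.g. "[1,1500,?]".
    Assumed notation: "?" is fully dynamic, "N.." has a lower bound only.
    Bounded ranges such as "1..448" or "..448" are not reported.
    """
    inner = shape_text.strip().strip("[]{}")
    if not inner:
        return []
    dims = [d.strip() for d in inner.split(",")]
    return [i for i, d in enumerate(dims) if d == "?" or d.endswith("..")]


def report_model_inputs(model_xml: str) -> Dict[str, List[int]]:
    """Map each model input name to its unbounded dimension indices.

    Inputs whose dimensions are all static or bounded are left out.
    """
    # Imported lazily so the parser above can be used without OpenVINO installed.
    import openvino as ov

    model = ov.Core().read_model(model_xml)
    report: Dict[str, List[int]] = {}
    for inp in model.inputs:
        indices = unbounded_dims(str(inp.get_partial_shape()))
        if indices:
            report[inp.get_any_name()] = indices
    return report
```

Calling `report_model_inputs` on each `.xml` file in the exported model directory (the exact file names depend on the optimum-intel version used for the export) should show which input changed between exports. For the shape in the error above, `unbounded_dims("[1,1500,?]")` returns `[2]`.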