openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Bug]: aten::pad_sequence is not supported #25976

Open ziyanxzy opened 1 month ago

ziyanxzy commented 1 month ago

OpenVINO Version

2024.4.0-16283-41691a36b90

Operating System

Windows System

Device used for inference

CPU

Framework

PyTorch

Model used

https://hf-mirror.com/openbmb/MiniCPM-V-2_6

Issue description

When I try to convert MiniCPM-V 2.6 (image_encoder) to an OpenVINO model, the conversion fails with the following summary:
-- No conversion rule found for operations: aten::pad_sequence
-- Conversion is failed for: prim::ListConstruct

Step-by-step reproduction

  1. Get MiniCPM-V 2.6 (vpm + resampler).
  2. Convert it to an OpenVINO model.

Relevant log output

File "C:\Users\SAS\miniforge3\envs\mini\lib\site-packages\openvino\frontend\frontend.py", line 18, in convert
    converted_model = super().convert(model)
openvino._pyopenvino.OpConversionFailure: Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:
FrontEnd API failed with OpConversionFailure:
Model wasn't fully converted. Failed operations detailed log:
-- prim::ListConstruct with a message:
Exception happened during conversion of operation __module.resampler/prim::ListConstruct with schema (no schema)
Check '(c_node)' failed at src/frontends/pytorch/src/op/list_construct.cpp:25:
FrontEnd API failed with OpConversionFailure:
[PyTorch Frontend] Translation for prim::ListConstruct support only constant inputs

Summary:
-- No conversion rule found for operations: aten::pad_sequence
-- Conversion is failed for: prim::ListConstruct

Python 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:18:13) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import openvino
>>> openvino.__version__
'2024.4.0-16283-41691a36b90'


eaidova commented 3 weeks ago

@ziyanxzy, at the moment there is no way to support this operation in OpenVINO while preserving dynamic shapes. After analyzing the model, I can say that this part is also problematic for model tracing in TorchScript: tracing uses constant folding to resolve the loop in the resampler forward, and as a result the traced model cannot process images whose number of patches or image sizes differ from the shapes used during tracing. I prepared reference code for model conversion and inference in OpenVINO in this notebook: https://github.com/openvinotoolkit/openvino_notebooks/pull/2302

Could you please check?
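
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what a pad_sequence-style operation computes, roughly mirroring torch.nn.utils.rnn.pad_sequence with batch_first=True for 1-D inputs. The helper name pad_sequences and the toy inputs are hypothetical and are not part of PyTorch, OpenVINO, or the MiniCPM-V code. It only illustrates why the output shape depends on the lengths of the input list, which is what makes the operation hard to express with a fixed traced graph.

```python
from typing import List, Sequence


def pad_sequences(
    seqs: Sequence[Sequence[float]], padding_value: float = 0.0
) -> List[List[float]]:
    """Pad variable-length 1-D sequences to the length of the longest one.

    Hypothetical stand-in for a pad_sequence-style op (batch-first layout).
    """
    if not seqs:
        return []
    # The padded length is only known once all inputs have been inspected.
    max_len = max(len(s) for s in seqs)
    return [list(s) + [padding_value] * (max_len - len(s)) for s in seqs]


# Two toy batches standing in for images with different patch counts.
batch_a = pad_sequences([[1.0, 2.0, 3.0], [4.0]])
batch_b = pad_sequences([[1.0], [2.0, 3.0]])

print(batch_a)                           # rows padded to length 3
print(len(batch_a[0]), len(batch_b[0]))  # padded lengths differ per batch
```

Because the padded length changes with the inputs, a graph recorded for one set of patch counts does not necessarily describe the computation for another set.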