acane77 opened 2 days ago
We found that this model can be compiled with OpenVINO 2024.3.0, but cannot be compiled with OpenVINO 2024.5.0-20240913.
Here is a minimal sample:
#include "openvino/core/core.hpp"
#include "openvino/openvino.hpp"
#include <string>
int main() try {
std::string model_path = "D:/Models/InternVL2-4B-int4-openvino";
const char* openvino_model_xml = "openvino_model.xml";
ov::Core core;
auto model = core.read_model(model_path + "/" + openvino_model_xml);
auto compiled_model = core.compile_model(model, "GPU");
auto request = compiled_model.create_infer_request();
}
catch (const std::exception& error) {
try {
std::cerr << error.what() << '\n';
} catch (const std::ios_base::failure&) {
}
return EXIT_FAILURE;
} catch (...) {
try {
std::cerr << "Non-exception object thrown\n";
} catch (const std::ios_base::failure&) {
}
return EXIT_FAILURE;
}
This seems to be a known functional regression. The dev team is planning to fix this issue shortly.
Hi, the issue is most likely caused by this commit:
https://github.com/openvinotoolkit/openvino/commit/90d1219c98fb2fdcb7448f1d18b25a370efd7ccf
The solution would be to fall back to the older behavior when stoi fails (i.e. when items_num depends on a dynamic axis), and therefore to apply only the stack-size heuristic when items_num cannot be known at build time.
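For illustration only, a minimal sketch of the suggested fallback; the helper name and its surroundings are assumptions, not the actual plugin code:

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper showing the proposed behavior: if the dimension string
// describes a static axis, std::stoi succeeds and items_num is known at build
// time; if the axis is dynamic, std::stoi throws and we keep only the
// stack-size heuristic, matching the pre-2024.5 behavior.
size_t resolve_items_num(const std::string& dim_str, size_t stack_size_heuristic) {
    try {
        return static_cast<size_t>(std::stoi(dim_str));
    } catch (const std::exception&) {
        return stack_size_heuristic;
    }
}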
PR #26703 fixes the issue
OpenVINO Version
2024.5.0dev20240913
Operating System
Windows System
Device used for inference
GPU
Framework
PyTorch
Model used
Phi3
Issue description
Couldn't run the Phi3 model on GPU; it runs normally on CPU.
This Phi3 model is extracted and converted from the InternVL2-4B model (https://huggingface.co/OpenGVLab/InternVL2-4B) with this script: https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/internvl2/internvl2.ipynb
The following error is then reported:
Can't choose implementation for rms:__module.model.layers.0.input_layernorm/aten::mul/Multiply_1_compressed_to_f16
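A minimal sketch of how the failure can be confirmed to be GPU-specific, assuming the extracted Phi3 language-model IR is available locally (the model path below is a placeholder, not the reporter's actual path):

// Illustrative sketch only; compiles the same IR on CPU and GPU and reports
// which device fails during compilation.
#include "openvino/openvino.hpp"

#include <iostream>
#include <string>

int main() {
    ov::Core core;
    auto model = core.read_model("phi3_language_model/openvino_model.xml");  // placeholder path
    for (const std::string device : {"CPU", "GPU"}) {
        try {
            core.compile_model(model, device);  // compilation is where the GPU failure is observed
            std::cout << device << ": compiled successfully\n";
        } catch (const std::exception& e) {
            std::cerr << device << ": failed: " << e.what() << '\n';
        }
    }
    return 0;
}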
Step-by-step reproduction
Relevant log output
Issue submission checklist