openvinotoolkit / openvino_notebooks

📚 Jupyter notebook tutorials for OpenVINO™
Apache License 2.0

Qwen2-vl unable to run #2393

Closed afreedizDB closed 1 month ago

afreedizDB commented 1 month ago

Unable to Run Qwen2-VL Model

I have successfully converted the model to the intermediate representation as per the guide provided in the OpenVINO Notebooks repository.

However, when I try to load the model with the following line:

model = OVQwen2VLModel(model_dir, device.value)

I encounter the following error:

model = OVQwen2VLModel(model_dir, device.value)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Afree\OneDrive\Desktop\qwen2-vl\ov_qwen2_vl.py", line 394, in __init__
    self.model = core.read_model(model_dir / LANGUAGE_MODEL_NAME)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Afree\AppData\Local\Programs\Python\Python312\Lib\site-packages\openvino\runtime\ie_api.py", line 502, in read_model
    return Model(super().read_model(model))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src\inference\src\cpp\core.cpp:90:
Check 'util::directory_exists(path) || util::file_exists(path)' failed at src\frontends\common\src\frontend.cpp:117:      
FrontEnd API failed with GeneralFailure:
ir: Could not open the file: "Qwen2-VL-2B-Instruct\openvino_language_model.xml"

Indeed, there is no file named openvino_language_model.xml in the directory.
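A quick way to diagnose this failure mode is to check that both halves of the IR (the .xml graph and the .bin weights) actually exist before constructing the model. This is a minimal sketch; `ir_files_present` is a hypothetical helper name, and `openvino_language_model` is the file stem from the error message above:

```python
from pathlib import Path

# Hypothetical helper: verify that both IR files (.xml graph definition
# and .bin weights) are present before attempting core.read_model().
def ir_files_present(model_dir: Path, stem: str = "openvino_language_model") -> bool:
    xml_path = model_dir / f"{stem}.xml"
    bin_path = model_dir / f"{stem}.bin"
    return xml_path.exists() and bin_path.exists()
```

If this returns False, the conversion step did not finish and needs to be re-run before loading the model.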

brmarkus commented 1 month ago

You might need to be a bit more patient... I just tried to reproduce this notebook, and conversion (plus NNCF compression) took more than 20 minutes on my machine. But the file was finally created (occupying the full 64GB of system memory for more than 20 minutes).

afreedizDB commented 1 month ago

Ok, when I tried this in a Jupyter notebook it reported that my kernel had died, so I put the code into a Python file and ran the conversion from there. It completed successfully (it took a few minutes), and all the files were saved locally except openvino_language_model.bin and openvino_language_model.xml:

from pathlib import Path
from ov_qwen2_vl import model_selector
from ov_qwen2_vl import convert_qwen2vl_model
import requests
import nncf

model_id = model_selector()
print(f"Selected {model_id.value}")
pt_model_id = model_id.value
model_dir = Path(pt_model_id.split("/")[-1])

compression_configuration = {
    "mode": nncf.CompressWeightsMode.INT4_ASYM,
    "group_size": 128,
    "ratio": 1.0,
}

convert_qwen2vl_model(pt_model_id, model_dir, compression_configuration)

brmarkus commented 1 month ago

Depending on how powerful your machine is and how much system memory you have (it may even need to swap to HDD/SSD, as conversion and compression take A LOT of system memory), it could really take more than a few minutes.

You might want to have a look into "ov_qwen2_vl.py" (line 320) and check if you saw console log messages like:

    if not lang_model_path.exists():
        print("⌛ Convert Language model")

and print("✅ Language model successfully converted")

and finally print(f"✅ {model_id} model conversion finished. You can find results in {output_dir}")

If a previous run failed (the Jupyter notebook kernel died, or at least became unresponsive because it was very busy), you might need to manually delete the subfolder Qwen2-VL-2B-Instruct (or whichever model you selected from the drop-down list) containing the previously downloaded and possibly incompletely converted & compressed model files.
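The manual cleanup described above can also be done with a couple of lines of Python. This is a sketch under the assumption that the output folder sits in the current working directory; "Qwen2-VL-2B-Instruct" is the folder name from this thread, so adjust it to whichever model you selected:

```python
import shutil
from pathlib import Path

# Remove a possibly incomplete conversion output folder so the next
# run starts from a clean state and re-converts everything.
# Assumption: the folder name matches the model selected in the notebook.
model_dir = Path("Qwen2-VL-2B-Instruct")
if model_dir.exists():
    shutil.rmtree(model_dir)
```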

afreedizDB commented 1 month ago

Thanks a lot, I identified my issue. It is in the section of code at line 305 in ov_qwen2_vl.py:

            ov_model = ov.convert_model(
                vision_embed_tokens,
                example_input={
                    "hidden_states": torch.randn([4988, 1280]),
                    "attention_mask": torch.ones([1, 4988, 4988]),
                    "rotary_pos_emb": torch.randn([4988, 40]),
                },
            )

Up to here everything works fine, but at the model conversion (line 305) it just hangs for a minute and exits; no code after this line gets executed. When I checked, the lang_model and image_embed_merger files do not exist.

brmarkus commented 1 month ago

On my machine (a laptop with an Intel Core Ultra 7 155H and 64GB system memory) the whole conversion and compression of all the models "hung" for 22 minutes, during which all 64GB of memory was in use and the HDD was being used to swap additional memory.

Do you have a lot of system memory and disk space left for swapping? (I don't know how Python reacts to running out of memory/storage.)
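Before starting a long conversion like this, it can be worth checking how much disk space is left for swap. This is a minimal sketch using only the standard library; `free_disk_gb` is a hypothetical helper name:

```python
import shutil

# Report free disk space on the given path, since swapping during
# model conversion and compression can consume tens of GB.
def free_disk_gb(path: str = ".") -> float:
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)

print(f"Free disk space: {free_disk_gb():.1f} GB")
```

Checking free RAM portably needs a third-party package such as psutil (`psutil.virtual_memory().available`); the standard library has no cross-platform equivalent.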

afreedizDB commented 1 month ago

Issue solved. Thank you

brmarkus commented 1 month ago

Can you share what has solved the issue? Is there something the notebook could be improved with?

afreedizDB commented 1 month ago

Nothing is wrong with the notebook. The conversion was taking a lot of memory, so I ran it on a server instead, which worked.

brmarkus commented 1 month ago

That is important information. The README could contain a hint about the (much) higher system memory requirements...

See e.g. "https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot" containing this:

Note: conversion of some models can require additional actions from user side and at least 64GB RAM for conversion.

On my machine with 64GB the conversion worked, but it used A LOT of memory swapped to SSD and took VERY, VERY long.