huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

GroundingDINO cannot work with MiniGPT4 #34885

Open pspdada opened 18 hours ago

pspdada commented 18 hours ago

System Info

Who can help?

@zucchini-nlp @amyeroberts

Information

Tasks

Reproduction

When I use the MiniGPT-4 model from the repository https://github.com/Vision-CAIR/MiniGPT-4, I find that Grounding DINO cannot be used together with it.

Specifically, when I import some required modules from the minigpt4 repository into my project (without doing anything else with the minigpt4 repo) and then run the Transformers Grounding DINO model, the program crashes outright at the model(**encoded_inputs) call with exit code SIG(117), and no traceback or other information is printed.

Other models, such as flan-t5-base-VG-factual-sg, complete their forward pass without crashing even when minigpt4 is imported.

After commenting out the four minigpt4 import lines shown below, the issue disappears entirely.
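One way to try to surface more information from a hard crash like this is Python's built-in faulthandler module, which dumps each thread's Python traceback to stderr when the process receives a fatal signal (SIGSEGV, SIGABRT, SIGBUS, SIGILL, SIGFPE). This is only a sketch: if the native code exits without raising one of those signals, it will print nothing.

import faulthandler

# Must run before the crashing call; on a fatal signal it dumps the
# Python traceback of every thread to stderr.
faulthandler.enable()

The same effect is available by running the script with python -X faulthandler. The full reproduction script follows.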

import torch
from PIL import Image
from transformers import (
    GroundingDinoForObjectDetection,
    GroundingDinoProcessor,
)

# imports modules for registration
from minigpt4.datasets.builders import *  # noqa
from minigpt4.models import *  # noqa
from minigpt4.processors import *  # noqa
from minigpt4.tasks import *  # noqa

image_path = "/root/llm-project/LVLM/eval/Extended_CHAIR/images/chair-500/000000006763.jpg"
image: Image.Image = Image.open(image_path)
model: GroundingDinoForObjectDetection = (
    GroundingDinoForObjectDetection.from_pretrained(
        "IDEA-Research/grounding-dino-base",
        cache_dir="/root/llm-project/utils/models/hub",
        torch_dtype="auto",
        low_cpu_mem_usage=True,
    )
    .to("cuda")
    .eval()
)

processor: GroundingDinoProcessor = GroundingDinoProcessor.from_pretrained(
    "IDEA-Research/grounding-dino-base",
    cache_dir="/root/llm-project/utils/models/hub",
)

text = "man.umbrella.top hat."

with torch.inference_mode():
    encoded_inputs = processor(
        images=image,
        text=text,
        max_length=200,
        return_tensors="pt",
        padding=True,
        truncation=True,
    ).to("cuda")
    outputs = model(**encoded_inputs)  # crashes here with SIG(117) and no traceback when the minigpt4 imports are active
    results = processor.post_process_grounded_object_detection(
        outputs=outputs,
        input_ids=encoded_inputs["input_ids"],
        box_threshold=0.25,
        text_threshold=0.25,
    )
    print(results)
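To narrow down which of the four minigpt4 imports triggers the crash, one could bisect them across fresh interpreter processes. This is a sketch: repro.py is a hypothetical copy of the script above that imports only the module named in sys.argv[1]. Fresh processes matter because a bad import can leave patched global state behind in the current interpreter.

import subprocess
import sys

# Hypothetical bisection driver: run the reproduction once per
# minigpt4 subpackage, each in a fresh interpreter, so a non-zero
# exit code pinpoints the offending import.
for module in [
    "minigpt4.datasets.builders",
    "minigpt4.models",
    "minigpt4.processors",
    "minigpt4.tasks",
]:
    result = subprocess.run([sys.executable, "repro.py", module])
    print(module, "->", "crashed" if result.returncode != 0 else "ok")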

Expected behavior

Since this issue involves another repository, could you help resolve it, or at least guide me on how to find the underlying cause? Combining multiple models is important for my project, but because the crash produces no traceback, I have no starting point for debugging.

qubvel commented 16 hours ago

Hi @pspdada, thanks for reporting the issue! Does it work fine without any imports from minigpt4 in your env?

pspdada commented 10 hours ago

> Hi @pspdada, thanks for reporting the issue! Does it work fine without any imports from minigpt4 in your env?

Yes