-
### System Info
- `transformers` version: 4.46.2
- Platform: Linux-5.15.0-120-generic-x86_64-with-glibc2.35
- Python version: 3.12.4
- Huggingface_hub version: 0.26.2
- Safetensors version: 0.4…
-
I cannot understand the tensor sizes: in the ViT the output size is (batch_size, 577, 768); in another ViT it is (batch_size, 257, 1408); and after the ViT and Q-Former the size is (batch_size, 32, 768). Why do these differ?
![image](https://github.com/user-attachments/…
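The differing shapes most likely come from two different vision encoders plus the Q-Former's fixed set of learned queries. A minimal sketch of the arithmetic, assuming CLIP ViT-L/14 at 336 px and EVA-CLIP ViT-g/14 at 224 px (common choices in BLIP-2-style models; check the actual config of the model in question):

```python
# Sketch: how the three sequence lengths can arise in a BLIP-2-style
# vision pipeline. The encoder choices below are assumptions, not
# confirmed details of this repository.

def vit_tokens(image_size: int, patch_size: int) -> int:
    """Number of ViT output tokens: one per patch plus a [CLS] token."""
    return (image_size // patch_size) ** 2 + 1

# CLIP ViT-L/14 at 336x336 -> 577 tokens, hidden width 768
assert vit_tokens(336, 14) == 577

# EVA-CLIP ViT-g/14 at 224x224 -> 257 tokens, hidden width 1408
assert vit_tokens(224, 14) == 257

# The Q-Former cross-attends over the ViT tokens and compresses them
# into a fixed number of learned query tokens, regardless of input size:
num_query_tokens = 32  # yields (batch_size, 32, 768) after the Q-Former
```

So the sequence length after the Q-Former is constant (32) by design, while the ViT sequence lengths depend on image resolution and patch size.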
-
Hello author, I am currently working on using the IEMOCAP dataset with a multi-label approach on your architecture, with audio, video, and text as input. But I ran into some problems with your code; here are t…
-
Hello, after downloading your project I modified the model paths in minigpt4_vicuna0.yaml and /mnt/sda1/mateng/BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt/MiniGPT-4/minigpt4/configs/models/minigpt4_vicuna0.yaml, but when running VAP.py…
-
### The model to consider.
GitHub code: https://github.com/Vision-CAIR/MiniGPT4-video
Hugging Face demo: https://huggingface.co/spaces/Vision-CAIR/MiniGPT4-video
Hugging Face package: https:…
-
Awesome project! But I found that minigpt4/configs/models/minigpt4_vicuna0.yaml does not exist. Please fix it. Thank you!
-
How can I use MiniGPT4 for batch inference without the Chat model? I can't find a method to do so.
This is necessary for us to test on new datasets.
-
Hello author.
We would like to fine-tune the method proposed in your paper on a new dataset. I have already extracted the audio and visual features. We currently only need to implement the emotion reco…
-
Could you specify the exact model used for EgoSchema eval? The paper states that the LLM backbone used for EgoSchema eval is LLaMA-2, but README states that Vicuna weights were used. If LLaMA-2 was in…