-
### Motivation
CogVLM2 is currently the state-of-the-art (SOTA) open-source VLM for captioning tasks.
### Related resources
_No response_
### Additional context
_No response_
-
### Your current environment
The output of `python collect_env.py`
```text
Network isolation, unable to download
Python 3.8
8 × A10 GPUs
Model: InternVL2-26B
vllm …
```
-
Currently one of the best VLMs is EvoVLM-JP-v1-7B, which was just released this week. You can grab it here: https://huggingface.co/SakanaAI/EvoVLM-JP-v1-7B. It has code on the Hugging Face page to get it eas…
-
Hi, the loss explained in the paper is slightly different from the one in the code:
https://github.com/YiyangZhou/POVID/blob/5d55ce605230f5ad3889701a894a98ddca6e1534/tool/dpo_trainer.py#L616
I understand w…
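For context, here is the textbook DPO objective from Rafailov et al. (2023) that most DPO trainers start from; whether POVID's weighting in the linked code matches it exactly is precisely the question, so treat this only as the standard baseline, not as a statement of what the repo implements:
```latex
% Standard DPO loss (Rafailov et al., 2023): \sigma is the logistic
% sigmoid, \beta the inverse temperature, \pi_{\mathrm{ref}} the frozen
% reference policy, y_w / y_l the preferred and rejected responses.
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```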
-
Does this software support [CogVLM](https://huggingface.co/THUDM/CogVLM)?
-
Hi, I've been exploring this repo for the past couple of days and I find your work here really amazing. I'm curious if there are any plans to add support for the Phi-3-vision-128k-instruct model to th…
-
Hi all,
Thank you for this fantastic project and repository! I'm currently working with the REPL example to create an OpenVLA demo/walkthrough and wanted to suggest a couple of improvements:
Sep…
-
### Describe the issue as clearly as possible:
I am trying to run this together with the new Idefics 3 vision-language model and am having trouble. Crossposting this as a comment in the I…
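As a baseline for reproducing this, here is a minimal sketch of running Idefics 3 with plain transformers, independent of this library. The checkpoint id, the test-image URL, and the version floor are assumptions on my part, not taken from the truncated report above:
```python
# Minimal sketch: run Idefics 3 with plain transformers.
# Assumptions: the HuggingFaceM4/Idefics3-8B-Llama3 checkpoint and
# transformers >= 4.46 (the first release with Idefics3 support,
# to my knowledge).
import requests
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "HuggingFaceM4/Idefics3-8B-Llama3"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Any test image works; this URL is just a placeholder.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Idefics 3 uses the multimodal chat-message format with an image slot.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```
If this baseline runs, the remaining question is how to hook structured generation into `generate`, which is presumably where the trouble reported here lies.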
-
Should I classify your attack method as a gray-box attack? In your paper, you rarely mention how to choose the surrogate model. I think the surrogate model should not be part of the target model I…
-
https://github.com/OpenGVLab/InternVL