-
Could you add our CVPR 2024 paper about vision-language pretraining, "Iterated Learning Improves Compositionality in Large Vision-Language Models", to this repo?
Paper link: https://arxiv.org/abs/…
-
- While the LaTeX environment is not fully set up, we'll write our thoughts here for now
----
- `ViperGPT` is a framework that leverages pre-trained vision-language models (`GLIP` for image object ground…
-
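Hello, thank you very much for organizing and curating this series of papers; it has been really helpful! While reading the literature, I noticed that some papers labeled "2024-NeurIPS" in the repository are actually "2023-NeurIPS". Below is the list of relevant papers I found, for reference:
2023-NeurIPS: [Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularizatio…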
-
Dear authors,
@shuyansy @UnableToUseGit
I would kindly suggest that you discuss VoCo-LLaMA[1], at the very least in the "Intro" section of your paper.
As I find the citation and discussions related to …
-
Dear shikiw,
Thank you for your valuable effort in curating research on MLLM hallucination! This excellent repository is impressively comprehensive and provides researchers with a clear sense of th…
-
*Sent by Google Scholar Alerts (scholaralerts-noreply@google.com). Created by [fire](https://fire.fundersclub.com/).*
---
### [PDF] [Attention Prompting on Image for Large Vision-Language…
-
Hi,
Thanks for your efforts on such a valuable collection!
Could you please add the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate"?
M…
-
- Here's a summary of our consultation with an LLM specialist:
---
- We have an initial thought in #74 as follows:
![image](https://github.com/user-attachments/assets/265a3d7d-0454-4e7b-9c99-a0dd9f9ecf7c…
-
### Motivation
Recently, there have been many good papers that try to alleviate hallucinations in large vision-language models **during the decoding process**, such as:
OPERA: Alleviating Hallucination in Mu…
zhly0 updated 2 months ago
-
This is a master issue to track all items related to the November 1st MultiNet Release. The motivation and scoping for this release are below. We follow with the specific issues being tracked with specific…