-
Thank you for your great work! Please consider including MMEvol:
## Datasets of Multimodal Instruction Tuning
| Name | Paper | Link | Notes |
|:-----|:-----:|:----:|:-----:|
| **MMEvol** | [MMEvol…
-
Are you making this for your own purposes or to establish a prelude for Shen generally?
Should we create a unified repo under the Shen-Language org for standard libraries? Or add a `libs/` folder i…
-
Dear Authors,
We'd like to add "GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning" to this repository, which has been accepted by NeurIPS 2024. [**Paper**](https:/…
-
Hi,
Thanks for your efforts on such a valuable collection!
Could you please add the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate"?
M…
-
ChartAssistant is an advanced chart-based MLLM that performs better than ChartLlama. It has been accepted by ACL 2024.
https://github.com/OpenGVLab/ChartAst
-
@inproceedings{naturalbench,
title={NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples},
author={Li, Baiqi and Lin, Zhiqiu and Peng, Wenxuan and Nyandwi, …
-
Hello! Could you please add the SALMONN series of models?
Title | Venue | Date | Code | Demo
-- | -- | -- | -- | --
[SALMONN: Towards Generic Hearing Abilities for Large Language Models](https://arxiv.o…
-
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model. This is the first work to propose visual instruction tuning with ID reference.
-
I am having issues installing this solver. It appears that Cython is having problems compiling the Cython files. Does anyone have a fix for this?
Here is what I ran:
```
conda create --name spectr…
-
# URL
- https://arxiv.org/abs/2309.15025
# Affiliations
- Tianhao Shen, N/A
- Renren Jin, N/A
- Yufei Huang, N/A
- Chuang Liu, N/A
- Weilong Dong, N/A
- Zishan Guo, N/A
- Xinwei Wu, N/A
…