-
VLMEVALKIT is a very convenient evaluation tool for MLLMs. I hope the authors can provide a framework in VLMEVALKIT that supports the evaluation of custom models and custom datasets. T…
-
After deploying InternVL2, I followed the access method described here: https://github.com/modelscope/ms-swift/blob/main/docs/source/Multi-Modal/MLLM%E9%83%A8%E7%BD%B2%E6%96%87%E6%A1%A3.md#minicpm-v-v2_5-chat. When I access the service via IP + port, results are returned normally, but when I access it via domain name, the following error is reported:
…
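Since the deployment exposes an HTTP endpoint, the request itself is identical for both access paths; only the base URL changes. A minimal sketch (host names and model name are placeholders, not from the original report), assuming an OpenAI-compatible `/v1/chat/completions` route such as the one `swift deploy` serves:

```python
import json

def build_chat_request(base_url: str, model: str, prompt: str):
    """Return the full URL and JSON body for a chat-completions call."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return url, body

# Same payload, two access paths -- placeholder hosts for illustration.
ip_url, ip_body = build_chat_request("http://192.0.2.10:8000", "internvl2", "hi")
dom_url, dom_body = build_chat_request("http://example.com", "internvl2", "hi")
```

If the IP route succeeds while the domain route fails with an identical payload, the difference is almost certainly in the reverse proxy in front of the service (e.g. the `proxy_pass` target, read timeouts, or request body size limits) rather than in the model server itself.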
-
Thanks for your great work~
I have some difficulties understanding the source code.
Q1: The original version of RECON supports multimodal input (pts, img, text), but RECON++ seems not to suppor…
-
Here is my prompt
Induce the concept from the in-context examples. Answer the question with a single word or phrase.
We name this a slation
We name this a dax
Based on the in-context ex…
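A few-shot induction prompt like the one above can be assembled programmatically; the following is a minimal sketch (function and variable names are illustrative, not from any referenced codebase):

```python
def build_induction_prompt(labels, query):
    """Build a concept-induction prompt from in-context nonce-word labels."""
    lines = ["Induce the concept from the in-context examples. "
             "Answer the question with a single word or phrase."]
    # One "We name this a <label>" line per in-context example.
    lines += [f"We name this a {label}" for label in labels]
    lines.append(query)
    return "\n".join(lines)

prompt = build_induction_prompt(
    ["slation", "dax"],
    "Based on the in-context examples, what is this?")
```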
-
### News: Ah, last week was so tough.....
- [Content I couldn't get to last week (sorry)](https://github.com/jungwoo-ha/WeeklyArxivTalk/issues/75)
- Conferences
- ICML 2023 and ACL 2023 reviews are out --> good luck, everyone!
- ICCV 2023 Supplementa…
-
### 🚀 The feature, motivation and pitch
InternVL2 is currently the most powerful open-source Multimodal Large Language Model (MLLM). The InternVL2 family includes models ranging from a 2B model, suit…
-
# Description
First, thank you for such a great open-source contribution. :clap:
I can't run the demo as stated in `README.md`; I get this error :pensive:
```python
Traceback (most recent call last):…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
None
### Reproduction
None
### Expected behavior
None
### Others
None
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
Hi There,
I am trying to run inference of MLLM models using the `CUDA_VISIBLE_DEVICES=0 llamafactory-cl…
-
Following this issue https://github.com/InternLM/InternLM-XComposer/issues/404
For some reason, when I feed a **text-only** batch with per_gpu_batch = 1 and batch_size = 5, the modeling.py only accepts on…
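One common cause of a batch collapsing to a single sample is the collate step: if text-only inputs are not padded to a common length and stacked together, only one element ends up reaching the model. A minimal sketch of the expected behavior (pure-Python padding for illustration; the actual modeling.py logic may differ):

```python
def collate_text_batch(samples, pad_id=0):
    """Pad a list of token-id lists to equal length so they stack as a batch."""
    max_len = max(len(s) for s in samples)
    # Every row is right-padded with pad_id up to the longest sequence.
    return [s + [pad_id] * (max_len - len(s)) for s in samples]

batch = collate_text_batch([[1, 2, 3], [4, 5], [6]])
# All 3 samples survive collation, each padded to length 3.
```

If the collate function instead indexes `samples[0]` (or the image branch short-circuits the text-only path), the `per_gpu_batch` and `batch_size` settings are silently ignored.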