-
The idea of this work is very interesting!
However, I have two points of confusion about the method:
(1) What's the ground-truth caption of the image in Fig. 2? Is the word "feather" correct? (I am not sure…
-
Recently, some MLLMs have adopted hermes2_yi34b as the base language model, such as [InternVL](https://github.com/OpenGVLab/InternVL) and [LLaVA](https://github.com/haotian-liu/LLaVA). Has your team applied it to the project, lik…
-
I evaluated LLaVA-1.5-7b on the MMVP dataset and found that its accuracy is 60.0%, which is significantly higher than the 24.7% reported in Table 3.
Upon comparing the evaluation code, I discovered t…
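For anyone else comparing these numbers: a minimal sketch of per-question vs. pair-level scoring, assuming the MMVP protocol where questions come in pairs and a pair only counts as correct if both of its questions are answered correctly. The input format below is illustrative, not the repo's actual one.
```python
# Sketch: per-question vs. pair-level accuracy on MMVP-style data.
# Assumes predictions are ordered so that questions 2i and 2i+1 form one pair;
# `correct` is a list of booleans, one per question (illustrative input format).

def per_question_accuracy(correct):
    return sum(correct) / len(correct)

def pair_accuracy(correct):
    # A pair counts as correct only if BOTH of its questions are correct.
    pairs = [correct[i] and correct[i + 1] for i in range(0, len(correct), 2)]
    return sum(pairs) / len(pairs)

if __name__ == "__main__":
    # Toy example: 4/6 questions right, but only 1/3 pairs fully right.
    correct = [True, True, True, False, False, True]
    print(f"per-question: {per_question_accuracy(correct):.1%}")  # 66.7%
    print(f"pair-level:   {pair_accuracy(correct):.1%}")          # 33.3%
```
This kind of gap (per-question accuracy far above pair accuracy) is consistent with the difference between 60.0% and the reported 24.7%, though I can't confirm that is the exact cause here.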
-
Thanks for the great effort on this repo! I see you provide the zero-shot results of several MLLMs on the ScienceQA-IMG dataset. Could you please add the detailed results (i.e., NAT, SOC, LAN) of the TEST…
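In case it is useful, a rough sketch of how I would aggregate those per-category numbers from the official problems.json, assuming each problem's "subject" field maps to NAT/SOC/LAN; the `predictions` dict and its format are hypothetical:
```python
import json
from collections import defaultdict

# Sketch: per-category accuracy (NAT / SOC / LAN) on the ScienceQA test split.
# Assumes the official problems.json with "split", "subject" and "answer" fields;
# `predictions` maps problem id -> predicted answer index (hypothetical format).

SUBJECT_TO_CAT = {
    "natural science": "NAT",
    "social science": "SOC",
    "language science": "LAN",
}

def per_category_accuracy(problems_path, predictions):
    problems = json.load(open(problems_path))
    hits, totals = defaultdict(int), defaultdict(int)
    for pid, prob in problems.items():
        if prob.get("split") != "test" or pid not in predictions:
            continue
        cat = SUBJECT_TO_CAT.get(prob["subject"], "OTHER")
        totals[cat] += 1
        hits[cat] += int(predictions[pid] == prob["answer"])
    return {cat: hits[cat] / totals[cat] for cat in totals}
```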
-
### Feature request
Dear CogVLM authors,
Thank you for your outstanding work on MLLMs.
In the demo, we can only query pictures. Is it possible to make the model process PDF files?
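In case a workaround helps in the meantime: a minimal sketch that renders each PDF page to an image (here with PyMuPDF, as one option) and queries the model page by page; `query_cogvlm` is a hypothetical stand-in for however the demo actually invokes the model.
```python
import fitz  # PyMuPDF; one option for rasterizing PDF pages

def pdf_to_page_images(pdf_path, dpi=150):
    """Render each page of a PDF to a PNG byte string."""
    doc = fitz.open(pdf_path)
    pages = []
    for page in doc:
        pix = page.get_pixmap(dpi=dpi)
        pages.append(pix.tobytes("png"))
    return pages

# Hypothetical usage: query the model page by page.
# `query_cogvlm(image_bytes, prompt)` stands in for the demo's real entry point.
# for i, img in enumerate(pdf_to_page_images("report.pdf")):
#     answer = query_cogvlm(img, f"Summarize page {i + 1}.")
```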
### Mot…
-
Training script
```
NPROC_PER_NODE=4 \
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
--model_type glm4v-9b-chat \
--model_id_or_path /MLLM/new_models/ZhipuAI/glm-4v-9b \
--dataset /data_archive…
-
Hi, thank you for your implementation.
While reading through your code, a question came up about the 'masked loss'.
Why do you mask out the last part of each loss using this function?
https:…
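In case the question is about the usual causal-LM shift: the logits at position i predict token i+1, so the final position has no label and its loss is normally masked out. A minimal PyTorch sketch of that shift-and-mask (not your repo's actual function, which I haven't checked line by line):
```python
import torch
import torch.nn.functional as F

def shifted_lm_loss(logits, input_ids, ignore_index=-100):
    """Cross-entropy where logits[:, i] predict input_ids[:, i + 1].

    The last logit position has no next token to predict, so it is dropped;
    equivalently, its loss is masked out.
    """
    shift_logits = logits[:, :-1, :].contiguous()   # drop last position
    shift_labels = input_ids[:, 1:].contiguous()    # drop first token
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=ignore_index,
    )

# Toy check: batch of 1, sequence length 5, vocabulary of 10.
logits = torch.randn(1, 5, 10)
input_ids = torch.randint(0, 10, (1, 5))
print(shifted_lm_loss(logits, input_ids))
```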
-
Hi,
thank you for this great work!
In Table 1 of your paper, an accuracy improvement is reported when adding S2 Scaling to LLaVA. As shown in Figure 1, the channel dimension of S2 Scaling is double …
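For context on the doubling: my understanding of S2 Scaling is that the image is additionally processed at a larger scale (split into crops of the base size, encoded, stitched back, and pooled to the base feature resolution), and the two scales' features are concatenated along the channel axis, which is what doubles it. A rough sketch under that assumption, with a generic `encode` standing in for the actual vision tower:
```python
import torch
import torch.nn.functional as F

def s2_features(image, encode, base_size=336):
    """Concatenate base-scale and 2x-scale features along the channel dim.

    `encode` is a generic stand-in for the vision encoder: it maps a
    (B, 3, base_size, base_size) image to (B, C, H, W) features.
    """
    # Scale 1: encode the image at its base resolution.
    feat_1x = encode(F.interpolate(image, size=(base_size, base_size),
                                   mode="bilinear", align_corners=False))

    # Scale 2: upsample to 2x, split into four base-size crops, encode each.
    big = F.interpolate(image, size=(2 * base_size, 2 * base_size),
                        mode="bilinear", align_corners=False)
    crops = [big[:, :, i:i + base_size, j:j + base_size]
             for i in (0, base_size) for j in (0, base_size)]
    feats = [encode(c) for c in crops]

    # Stitch the four crop feature maps back into one 2x map, then pool it
    # down to the base feature resolution.
    top = torch.cat([feats[0], feats[1]], dim=3)
    bottom = torch.cat([feats[2], feats[3]], dim=3)
    feat_2x = F.interpolate(torch.cat([top, bottom], dim=2),
                            size=feat_1x.shape[-2:], mode="area")

    # Channel dimension doubles here: C -> 2C.
    return torch.cat([feat_1x, feat_2x], dim=1)
```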
-
Hello authors, I have been trying e5-v and do see fairly good results in long-text retrieval, but when I tried to reproduce the paper's numbers I found they do not match.
The experiments were run on Flickr30K; the results are below.
### Released e5-v weights
The test results are as follows:
image_retrieval_recall@1 | image_retrieval_recall@5 | image_retrieval_recall@10
-…
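For reference, a minimal sketch of how I compute image_retrieval_recall@k (text-to-image retrieval) for this comparison, assuming L2-normalized caption and image embeddings and a ground-truth image index per caption; the variable names are illustrative:
```python
import torch

def image_retrieval_recall(text_emb, image_emb, gt_image_idx, ks=(1, 5, 10)):
    """Text-to-image retrieval recall@k.

    text_emb:     (num_captions, D) L2-normalized caption embeddings
    image_emb:    (num_images, D)   L2-normalized image embeddings
    gt_image_idx: (num_captions,)   index of the correct image per caption
    """
    sims = text_emb @ image_emb.T                  # cosine similarity
    ranked = sims.argsort(dim=1, descending=True)  # best-matching images first
    gt = torch.as_tensor(gt_image_idx).unsqueeze(1)
    recalls = {}
    for k in ks:
        hit = (ranked[:, :k] == gt).any(dim=1).float()
        recalls[f"image_retrieval_recall@{k}"] = hit.mean().item()
    return recalls
```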