-
We currently have `multimodal_chat_dataset`, which is great for conversations about an image, but many VQA datasets are structured more like instructions, where there is a question column, answer column, and…
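To make the request concrete, here is a minimal sketch of the mapping I have in mind (the `to_messages` helper, the column names, and the message schema are hypothetical, not an existing API): remap question/answer columns into the conversation format that `multimodal_chat_dataset` already consumes.
```python
# Hypothetical sketch only: convert an instruct-style VQA row (question,
# answer, image columns) into a two-turn conversation. Column names and the
# message schema below are assumptions, not an existing builder.
from datasets import load_dataset

def to_messages(sample):
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "image"},  # the image itself stays in sample["image"]
                {"type": "text", "content": sample["question"]},
            ]},
            {"role": "assistant", "content": [
                {"type": "text", "content": sample["answer"]},
            ]},
        ]
    }

ds = load_dataset("your/vqa-dataset", split="train")  # placeholder dataset id
ds = ds.map(to_messages)
```
A builder that wraps this mapping and exposes `question_col`/`answer_col` arguments would cover most instruction-style VQA sets.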
-
## Issue
I keep getting `nan` loss when training Llama-3.2-Vision.
I tried:
- gradient clipping (sketch below)
- a lower learning rate
- a higher batch size, LoRA rank, and alpha
None of these helped.
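For reference, the gradient clipping I tried looks roughly like this (a sketch; `model`, `optimizer`, and `batch` are placeholders for the recipe's objects, and the finiteness check is only there to confirm the loss is already `nan` before backward):
```python
import torch

# Sketch of the guarded training step: clip gradients before the update and
# log when the loss is already non-finite coming out of the forward pass.
loss = model(**batch).loss
if torch.isfinite(loss):
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
else:
    print("loss is non-finite before backward; skipping this batch")
optimizer.zero_grad()
```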
## …
-
Hi,
I encountered the following issue when I ran the `train.sh` file under `src/experiments/vqa/`:
```shell
Traceback (most recent call last):
  File "train.py", line 24, in <module>
    from dataGe…
```
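In case it helps narrow this down, here is a sketch of the path workaround I would try (assuming `train.py` sits at `<repo>/src/experiments/vqa/train.py`; the layout is my guess from the path above):
```python
# Sketch: make the repository root importable before the failing import runs.
import sys
from pathlib import Path

repo_root = Path(__file__).resolve().parents[3]  # vqa -> experiments -> src -> repo
sys.path.insert(0, str(repo_root))
```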
-
## Title: WorldCuisines: A Large-Scale Benchmark Dataset for Multilingual and Multicultural Visual Question Answering on World Cuisines
## Link: https://arxiv.org/abs/2410.12705
## Abstract:
Vision-language models (VLMs) often struggle to understand culture-specific knowledge, especially in languages other than English and in underrepresented cultural contexts. To evaluate VLMs' understanding of such knowledge…
-
Dear Maintainers,
I'm currently trying to reproduce the zero-shot results of InstructBLIP. The caption of Table 5 says that for datasets with OCR tokens, the image query embeddings are simply append…
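To check my reading, here is the operation I understand the caption to describe, as a sketch with dummy tensors (shapes are illustrative, not taken from the paper):
```python
import torch

# Dummy shapes for illustration only.
batch, n_query, n_ocr, dim = 2, 32, 16, 768
query_embeds = torch.randn(batch, n_query, dim)  # Q-Former image query output
ocr_embeds = torch.randn(batch, n_ocr, dim)      # embedded OCR tokens

# "Simply appended": concatenated along the sequence dimension before the LLM?
llm_inputs = torch.cat([query_embeds, ocr_embeds], dim=1)
print(llm_inputs.shape)  # torch.Size([2, 48, 768])
```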
-
Thank you very much for your work!
How do you evaluate open-ended questions, for example on the VQA-RAD dataset? Do you use LLaVA's prompt template?
Thanks!
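For reference, this is the kind of LLaVA-style prompt I have in mind (a sketch; the exact system prompt and separators depend on the LLaVA version):
```python
# Sketch of a LLaVA v1-style prompt for an open-ended question; the sample
# question below is illustrative only, not taken from VQA-RAD.
def build_prompt(question: str) -> str:
    system = ("A chat between a curious human and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the human's questions.")
    return f"{system} USER: <image>\n{question} ASSISTANT:"

print(build_prompt("What abnormality is seen in this chest X-ray?"))
```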
-
### Which component is impacted?
Video Processing
### Is it a regression? Did it work in an old configuration?
Yes, it worked in the old version.
### What happened?
Corrupted output file.
-------------- Reproducibl…
-
Hello Author:
I have recently reproduced your paper, and with the dataset you provided I get the following VQA 2.0 results:
`{'number': 50.91, 'other': 59.45, 'overall': 69.13, 'yes/no': 85.29}`
The result is a …
-
Hello,
Thanks for your great work!
Has the code for video-mamba-suite on EgoSchema been released?