-
Hello, after reading your paper, I was really impressed not only by the innovative research but also by the beautiful and intuitive way the graphs were presented. I wanted to ask: how did you create …
-
How about adding Visual Question Answering to hezar?
A few days ago I saw that there is a visual question answering benchmark for Persian, and I thought it would be nice to have VQA in hezar.
I would also like …
-
I'm getting this error while running the default videos:
visual_question_answering(("how many boats are there in the video?", 0)) is not a valid tool, try one of [caption_retrieval, segment_localization,…
-
Hi, I'm confused. I did some visual question answering with the InternVL2-26B model, and it performs very badly at it. The only models that pass that question are Gemini 1.5 Pro/Flash, GPT-4o, and Claude.
…
-
Hello,
How can we obtain the confidence of a visual question answering prediction?
Thanks
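For classification-style VQA models (those that score a fixed answer vocabulary), a common way to expose a confidence is to softmax the answer logits and report the top probability. A minimal sketch, assuming you already have raw logits from such a model (the answer vocabulary and logits below are made up for illustration):

```python
import numpy as np

def answer_with_confidence(logits, answer_vocab):
    """Return the highest-scoring answer and its softmax probability."""
    # Numerically stable softmax over the answer vocabulary.
    z = logits - np.max(logits)
    probs = np.exp(z) / np.exp(z).sum()
    best = int(np.argmax(probs))
    return answer_vocab[best], float(probs[best])

# Hypothetical logits for a 3-answer vocabulary.
logits = np.array([2.0, 0.5, -1.0])
answer, conf = answer_with_confidence(logits, ["yes", "no", "2"])
print(answer, round(conf, 3))  # → yes 0.786
```

Note that the transformers `visual-question-answering` pipeline already returns a `score` field with each answer, which for classification-style models is essentially this quantity.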
-
### Feature request
Currently, the [visual-question-answering pipeline/task](https://huggingface.co/tasks/visual-question-answering) in transformers is not supported for ONNX export:
https://githu…
-
Hi, this is great work and I'm following it. I have some questions and hope you can offer some solutions. First, how do you code "Entity Relevance" and "Relational Relevance" in the VQA task? Do you just use the "…
-
The following tasks are available in the model hub, and seem to have inference support, but are not yet listed in the [Inference API docs](https://huggingface.co/docs/api-inference/detailed_parameters…
-
|id|title|author|year|
|---|---|---|---|
|1|Graph-Structured Representations for Visual Question Answering|Teney, Damien and Liu, Lingqiao and van den Hengel, Anton|2017|
-
Enjoying the recent Gradio notebook stuff!
I was curious about when/if there will be support for an additional Hugging Face task option of ["visual question answering"](https://huggingface.co/models?pipeline_tag=…