-
Hi, how can I obtain VQA pairs for the DUDE dataset?
I searched for DUDE_loader on HuggingFace, but it seems it only provides OCR results, images, and PDFs for the documents.
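Not an answer from the dataset maintainers, but as a sketch: if the VQA annotations ship as a JSON file of question/answer records alongside the OCR/image/PDF files, pairing them up could look like the following. The file layout and field names (`docId`, `question`, `answers`) are assumptions for illustration, not the actual DUDE schema.

```python
import json

def extract_vqa_pairs(annotations_json):
    """Turn a list of annotation records into (doc_id, question, answers) tuples.

    The field names below are hypothetical -- check the actual DUDE
    annotation schema before relying on them.
    """
    records = json.loads(annotations_json)
    pairs = []
    for rec in records:
        pairs.append((rec["docId"], rec["question"], rec.get("answers", [])))
    return pairs

# Tiny inline example standing in for the real annotations file.
sample = json.dumps([
    {"docId": "doc-001", "question": "What is the invoice total?", "answers": ["$42.00"]},
])
print(extract_vqa_pairs(sample))
```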
-
I tried to run the training code on both datasets, but both threw an error at the same line of code.
Line 205: vqa_result = evaluation(model_without_ddp, test_loader, device, config…
-
Thanks for the great work. While trying to reproduce your code, I noticed that during pretraining, if you set `mm_vision_output_token_count = 576`, you get:
```
File "llava-token-compression/ll…
```
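For context, a back-of-envelope check (my own reasoning, not from the authors): 576 is the number of patch tokens a CLIP ViT-L/14 vision encoder produces at 336×336 input, so an error at this setting usually means the configured token count disagrees with the vision tower's actual patch grid.

```python
def vision_token_count(image_size: int, patch_size: int) -> int:
    """Number of patch tokens a ViT produces (excluding any CLS token)."""
    grid = image_size // patch_size
    return grid * grid

# CLIP ViT-L/14 at 336px: a 24x24 grid of patches -> 576 tokens.
print(vision_token_count(336, 14))  # -> 576
```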
-
Hello, this error occurred when I tried to train the model. How can I download the .pth file?
/home/jinw/anaconda3/envs/KVQ/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torc…
-
Hi authors, thank you for your great paper and for releasing the code. Could you say when the code, especially for the VQA tasks, will be fully published?
-
Is there any code to support grounded VQA?
-
@Neerajj9 In the following Python code, https://github.com/Neerajj9/Stacked-Attention-Networks-for-Visual-Question-Answering/blob/master/save_QA.py, at line 102, where can I find the vqa_final_…
-
Hi! I have some questions about the files preprocessed from the VQA-CP v2 dataset and used in your work. Since VQA-CP v2 only provides training and test questions and annotations, how do you generate the fil…
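One common recipe (an assumption on my part, not necessarily how the paper's preprocessed files were produced) is to hold out a fraction of the training question ids as a validation split, since VQA-CP v2 itself ships only train and test splits. A minimal sketch:

```python
import random

def split_train_val(question_ids, val_fraction=0.05, seed=0):
    """Hold out a fraction of training question ids as a validation split.

    Generic recipe with a fixed seed for reproducibility -- not
    necessarily the exact procedure used in the paper.
    """
    ids = sorted(question_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]  # (train_ids, val_ids)

train_ids, val_ids = split_train_val(range(100), val_fraction=0.1)
print(len(train_ids), len(val_ids))  # -> 90 10
```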
-
### Question
As described in the paper, I see there are multiple phases for VizWiz benchmark submission. Could anyone point out the exact phase we should submit to?
![image](https://github.com/user-attachme…
-
Hi team,
I'd be interested to see whether we could add the [MobileCaptureVQA](https://huggingface.co/datasets/arnaudstiegler/mobile_capture_vqa) dataset to this benchmark.
This VQA dataset focused…