-
Hi team,
I'd be interested to see whether we could add the [MobileCaptureVQA](https://huggingface.co/datasets/arnaudstiegler/mobile_capture_vqa) dataset to this benchmark.
This VQA dataset focused…
-
# ComfyUI Error Report
## Error Details
- **Node Type:** Qwen2_VQA
- **Exception Type:** TypeError
- **Exception Message:** can only concatenate str (not "list") to str
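
For context, this message is what Python raises when a list is added to a string with `+`. The sketch below reproduces the message and shows one possible guard. It is not the Qwen2_VQA node's actual code: `join_prompt` and the sample strings are hypothetical, chosen only to illustrate the failure mode.

```python
def join_prompt(prefix, text):
    """Hypothetical helper: concatenate a prefix with text that may arrive
    either as a single string or as a list/tuple of strings."""
    if isinstance(text, (list, tuple)):
        # Flatten the sequence into one string before concatenating.
        text = " ".join(str(part) for part in text)
    return prefix + text


# Reproduce the reported failure: str + list is not defined.
try:
    "Describe: " + ["a cat", "a dog"]
except TypeError as exc:
    error_message = str(exc)

# The guarded helper accepts both input shapes.
from_list = join_prompt("Describe: ", ["a cat", "a dog"])
from_str = join_prompt("Describe: ", "a cat")
```

Whether joining with a space is the right behaviour depends on what the node expects upstream; the point is only that the input type has to be normalised before the `+`.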
-
Hi! First of all, congrats on the awesome work.
I wanted to play around with the pretrained model in the downstream task of report generation. Given that the pretrained weights of MedViLL are avai…
-
Thank you for the incredible set of repositories (this one and prismatic-vlms), it has been a great joy using them. Very well-designed, configurable, and easy to use for researchers.
I'm running in…
-
While running our code, we get the error below. Please help me out.
OSError Traceback (most recent call last)
in ()
85
86 if __name__ == "__main…
-
I am trying to run inference with the model for the infographic VQA task. The instructions give the CLI command for a dummy task, as follows:
python -m pix2struct.example_inference \
--gin_…
-
Posting to gauge/express interest in MiniGPT-v2 support being added.
-
We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here.
You can either:
1. Suggest a new feature by leaving a comment…
-
### Metadata
- Authors: Kushal Kafle, Scott Cohen, +1 author Christopher Kanan
- Organization: Rochester Institute of Technology & Adobe Research
- Conference: CVPR 2018
- Paper: https://arxiv.org…
-
## Issue
I keep getting `nan` loss when training Llama-3.2-vision.
I tried:
- gradient clipping
- lower learning rate
- higher batch size, lora rank and alpha
But with no success.
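
One possible reason clipping alone may not help: if a gradient is already `nan` or `inf`, rescaling it leaves it non-finite. The following is a minimal, framework-free sketch of clip-by-global-norm with a finiteness check, assuming gradients are given as a flat list of floats. The function names and the skip-the-step convention are illustrative assumptions, not taken from the issue or from any training library.

```python
import math


def global_norm(grads):
    """L2 norm over all gradient values (nan/inf propagate into the result)."""
    return math.sqrt(sum(g * g for g in grads))


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale grads so their global norm is at most max_norm.

    Returns None when the norm is not finite, so the caller can skip the
    optimizer step instead of applying nan/inf updates.
    """
    norm = global_norm(grads)
    if not math.isfinite(norm):
        return None
    scale = min(1.0, max_norm / (norm + eps))
    return [g * scale for g in grads]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)      # norm 5 is scaled down
untouched = clip_by_global_norm([0.3, 0.4], max_norm=1.0)    # already within bound
skipped = clip_by_global_norm([float("nan"), 1.0], max_norm=1.0)
```

Logging how often the `None` branch is hit can help show whether the non-finite values first appear in the gradients or earlier, in the forward pass.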
## …