-
How to get **novel_vqa_val_known_question.json** and **novel_vqa_val_known_annotation.json**?
-
```
The configure line used in current ubuntu and debian packages of ffmpeg-damnvid
makes the binary unredistributable:
./configure --enable-memalign-hack --enable-libxvid --enable-libx264
--enable…
-
"When I use LLaVA to generate the corresponding captions, generation is very slow: producing the vqa_LLVA and vqa_LLVA_more_face_detail descriptions for a single image takes about one minute."
-
Thank you for your great work.
I found that Groma performs poorly on some object detection tasks, which makes it difficult for me to determine whether the problem occurs in the inference phase or the…
-
# Baseline (updn, v2)
I run the following code:
```shell
CUDA_VISIBLE_DEVICES=0 python main.py --dataset v2 --mode updn --debias none --output v2_updn --seed 0
```
and get the following log:
`…
-
Thank you for sharing this code!
I am testing your code for multitask video with BART on 24GB GPUs.
To fit your code on 24GB GPUs, I used the command below to enable DDP (batch size: 50 -> 25).
bash…
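For reference, a multi-GPU DDP launch with a halved per-GPU batch size typically looks like the following. This is a hypothetical sketch, not the repository's actual script: the `torchrun` launcher is standard PyTorch, but the entry-point name and the `--batch_size` flag are assumptions about this codebase.

```shell
# Hypothetical sketch: launch the training entry point under PyTorch DDP
# on 2 GPUs, halving the per-GPU batch size (50 -> 25) so the effective
# global batch size stays at 50. Script name and flag are assumptions.
torchrun --nproc_per_node=2 main.py --batch_size 25
```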
-
What GPU resources and how much training time are required?
-
Thank you for your code! I would like to know whether, in your implementation, the validation dataset was used at all. Was validation also conducted on the training dataset?