This is the official repo for the ICCV 2023 paper "Toward Unsupervised Realistic Visual Question Answering", which encourages the model to answer answerable questions and reject unanswerable ones (see the figure below). See the paper for details.
git clone https://github.com/chihhuiho/RGQA.git
You might need to install different PyTorch and torchvision versions depending on your device.
conda env create -f environment.yml
cd data
sh download_rgqa.sh
|-- data
    |-- butd
    |-- download_rgqa.sh
    |-- gqa
    |-- lxmert
    |-- mscoco_imgfeat
    |-- nlvr2
    |-- nlvr2_imgfeat
    |-- vg_gqa_imgfeat
    |-- vqa
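After the download finishes, it can help to confirm that the layout matches the listing above. The following is a minimal sketch, not part of the repo: the helper name `missing_entries` is hypothetical, and it assumes that every listed entry sits directly under `data/`.

```python
from pathlib import Path

# Entries copied from the directory listing above.
# Assumption: each one sits directly under data/.
EXPECTED_ENTRIES = [
    "butd", "download_rgqa.sh", "gqa", "lxmert", "mscoco_imgfeat",
    "nlvr2", "nlvr2_imgfeat", "vg_gqa_imgfeat", "vqa",
]


def missing_entries(data_root):
    """Return the expected entries that are absent from data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    if missing:
        print("Missing entries under data/:", ", ".join(missing))
    else:
        print("All expected entries are present under data/.")
```

Run it from the repository root so that the relative path `data` resolves correctly.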
To encourage future research on RVQA, we provide a script that evaluates models on the proposed dataset using the proposed metrics (e.g. AUAF, FF95 and FACC). As long as the model predictions are organized like the provided example, the script can be used to compute the model performance. Below are the steps.
cd compute_accfpr
python compute_accfpr.py
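For intuition about what the three metrics measure, here is a minimal sketch. It is not the code in `compute_accfpr.py`, and the definitions are assumptions to be checked against the paper: FACC is taken as the accuracy on answerable questions (AQs) when nothing is rejected, AUAF as the area under the accuracy versus false positive rate (FPR) curve, and FF95 as the FPR at which accuracy reaches 95% of FACC. The input format (per-question acceptance scores and correctness flags) and the function names are hypothetical and do not reflect the format of the provided example.

```python
def acc_fpr_curve(aq_scores, aq_correct, uq_scores):
    """Sweep a rejection threshold and return a list of (fpr, acc) points.

    Assumption: a question is accepted when its score is >= the threshold.
    acc = fraction of AQs that are accepted and answered correctly.
    fpr = fraction of unanswerable questions (UQs) that are accepted.
    """
    thresholds = sorted(set(aq_scores) | set(uq_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: everything is rejected
    for t in thresholds:
        accepted_correct = sum(
            1 for s, c in zip(aq_scores, aq_correct) if s >= t and c
        )
        accepted_uq = sum(1 for s in uq_scores if s >= t)
        points.append((accepted_uq / len(uq_scores),
                       accepted_correct / len(aq_scores)))
    return points


def summarize(points, level=0.95):
    """Compute the assumed AUAF, FF95 and FACC from (fpr, acc) points."""
    facc = points[-1][1]  # last point accepts every question
    # Trapezoidal area under the accuracy-vs-FPR curve.
    auaf = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    # Smallest FPR at which accuracy reaches `level` of the full accuracy.
    ff = next(fpr for fpr, acc in points if acc >= level * facc)
    return {"AUAF": auaf, "FF95": ff, "FACC": facc}


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    curve = acc_fpr_curve(
        aq_scores=[0.9, 0.8, 0.3, 0.2],
        aq_correct=[True, True, False, True],
        uq_scores=[0.7, 0.1],
    )
    print(summarize(curve))
```

Under these assumed definitions, a better model has higher AUAF and FACC and lower FF95. Use the provided script for any reported numbers.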
Below are the scripts for 3 backbones: lxmert, butd and uniter. Simply replace "BACKBONE" with the desired backbone in the following commands.
Finetune vanilla BACKBONE on GQA
sh scripts/BACKBONE/train/vanilla.sh 0
Finetune BACKBONE with random pairing Pseudo UQ (RP) on GQA
sh scripts/BACKBONE/train/rp.sh 0
Finetune BACKBONE with hard Pseudo UQ (RP) on GQA
sh scripts/BACKBONE/train/rp_with_hard_uq.sh 0
Finetune BACKBONE with mixup RoI on GQA
sh scripts/BACKBONE/train/mix.sh 0
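The BACKBONE substitution in the commands above can be sketched as follows. This is only an illustration: the helper name `training_commands` is hypothetical, and it assumes that the trailing `0` in each command is a GPU id and that all four scripts exist for every backbone.

```python
BACKBONES = ["lxmert", "butd", "uniter"]
TRAIN_SCRIPTS = ["vanilla.sh", "rp.sh", "rp_with_hard_uq.sh", "mix.sh"]


def training_commands(backbone, gpu_id=0):
    """Build the training commands for one backbone.

    Assumption: the trailing argument of each script is a GPU id.
    """
    if backbone not in BACKBONES:
        raise ValueError(f"Unknown backbone: {backbone!r}")
    return [f"sh scripts/{backbone}/train/{script} {gpu_id}"
            for script in TRAIN_SCRIPTS]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for backbone in BACKBONES:
        for command in training_commands(backbone):
            print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without the datasets or a GPU.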
The pretrained weights for the different RVQA approaches can be downloaded using the following commands (about 8GB).
cd snap/gqa
sh download_rgqa_ckpt.sh
BACKBONE with FRCNN
sh scripts/BACKBONE/test/frcnn.sh $GPUID
sh scripts/BACKBONE/test/frcnn.sh 0
The result is located in "snap/gqa/BACKBONE/test/frcnn" and it should be identical to "snap/gqa/pretrain/BACKBONE/frcnn".
BACKBONE with MSP
sh scripts/BACKBONE/test/msp.sh $GPUID
sh scripts/BACKBONE/test/msp.sh 0
The result is located in "snap/gqa/BACKBONE/test/msp" and it should be identical to "snap/gqa/pretrain/BACKBONE/msp".
BACKBONE with ODIN
sh scripts/BACKBONE/test/odin.sh $GPUID
sh scripts/BACKBONE/test/odin.sh 0
The result is located in "snap/gqa/BACKBONE/test/odin" and it should be identical to "snap/gqa/pretrain/BACKBONE/odin".
BACKBONE with Maha
sh scripts/BACKBONE/test/maha.sh $GPUID
sh scripts/BACKBONE/test/maha.sh 0
The result is located in "snap/gqa/BACKBONE/test/maha" and it should be identical to "snap/gqa/pretrain/BACKBONE/maha".
BACKBONE with Energy
sh scripts/BACKBONE/test/energy.sh $GPUID
sh scripts/BACKBONE/test/energy.sh 0
The result is located in "snap/gqa/BACKBONE/test/energy" and it should be identical to "snap/gqa/pretrain/BACKBONE/energy".
BACKBONE with Q-C
sh scripts/BACKBONE/test/qc.sh $GPUID
sh scripts/BACKBONE/test/qc.sh 0
The result is located in "snap/gqa/BACKBONE/test/qc" and it should be identical to "snap/gqa/pretrain/BACKBONE/qc".
BACKBONE with resample
sh scripts/BACKBONE/test/resample.sh $GPUID
sh scripts/BACKBONE/test/resample.sh 0
The result is located in "snap/gqa/BACKBONE/test/resampling" and it should be identical to "snap/gqa/pretrain/BACKBONE/resampling".
BACKBONE with RP with only hardUQ
sh scripts/BACKBONE/test/rp_with_harduq.sh $GPUID
sh scripts/BACKBONE/test/rp_with_harduq.sh 0
The result is located in "snap/gqa/BACKBONE/test/RP_with_hard_uq" and it should be identical to "snap/gqa/pretrain/BACKBONE/RP_with_hard_uq".
BACKBONE with RP
sh scripts/BACKBONE/test/rp.sh $GPUID
sh scripts/BACKBONE/test/rp.sh 0
The result is located in "snap/gqa/BACKBONE/test/RP" and it should be identical to "snap/gqa/pretrain/BACKBONE/RP".
BACKBONE with Mixup
sh scripts/BACKBONE/test/mixup.sh $GPUID
sh scripts/BACKBONE/test/mixup.sh 0
The result is located in "snap/gqa/BACKBONE/test/mixup" and it should be identical to "snap/gqa/pretrain/BACKBONE/mixup".
BACKBONE with Ensemble
sh scripts/BACKBONE/test/ensemble.sh $GPUID
sh scripts/BACKBONE/test/ensemble.sh 0
The result is located in "snap/gqa/BACKBONE/test/ensemble" and it should be identical to "snap/gqa/pretrain/BACKBONE/ensemble".
Test all RVQA approaches with BACKBONE
sh scripts/BACKBONE/test/test_all.sh $GPUID
sh scripts/BACKBONE/test/test_all.sh 0
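Each test result above is expected to match the corresponding folder under snap/gqa/pretrain. One way to check this is to compare the two folders file by file. The sketch below is not part of the repo; the helper name `differing_files` is hypothetical, and it assumes that "identical" means the files match byte for byte, which may be too strict if the outputs contain timestamps or other run-specific logs.

```python
import filecmp
from pathlib import Path


def differing_files(result_dir, reference_dir):
    """Return relative paths that differ or exist in only one directory.

    Note: filecmp.dircmp compares files shallowly by default, so two files
    with the same type, size and modification time are treated as equal.
    """
    problems = []

    def walk(comparison, prefix):
        names = (comparison.left_only + comparison.right_only
                 + comparison.diff_files + comparison.funny_files)
        for name in names:
            problems.append(str(prefix / name))
        # Recurse into subdirectories present on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(str(result_dir), str(reference_dir)), Path("."))
    return sorted(problems)
```

For example, with the lxmert backbone and the MSP approach, `differing_files("snap/gqa/lxmert/test/msp", "snap/gqa/pretrain/lxmert/msp")` returns an empty list when the two folders match.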
The repo uses the code and checkpoint from