Video Question Answering with Phrases via Semantic Roles
Arka Sadhu, Kan Chen, Ram Nevatia
NAACL 2021
Video Question Answering has been studied through the lens of N-way phrase classification. While this eases evaluation, it severely limits its application in the wild. Here, we require the model to generate the answer and we propose a novel evaluation metric using relative scoring and contrastive scoring. We further create ActivityNet-SRL-QA and Charades-SRL-QA.
Clone repo:
git clone https://github.com/TheShadow29/Video-QAP
cd Video-QAP
export ROOT=$(pwd)
Set up a new conda environment using the provided vidqap_env.yml file. Please refer to the Miniconda documentation for details on installing conda.
MINICONDA_ROOT=[to your Miniconda/Anaconda root directory]
conda env create -f vidqap_env.yml --prefix $MINICONDA_ROOT/envs/vidqap_pyt
conda activate vidqap_pyt
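Before moving on, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of the repository: the helper name env_is_active is hypothetical, and it assumes that conda exports CONDA_PREFIX on activation and that the environment directory is named vidqap_pyt as in the commands above.

```shell
# Hypothetical helper (not part of Video-QAP): succeeds when the active
# conda environment's path ends in the given directory name.
env_is_active() {
  case "${CONDA_PREFIX:-}" in
    */"$1") return 0 ;;   # e.g. .../envs/vidqap_pyt
    *)      return 1 ;;   # unset, or a different environment
  esac
}

if env_is_active vidqap_pyt; then
  echo "vidqap_pyt is active"
else
  echo "vidqap_pyt is not active; run: conda activate vidqap_pyt"
fi
```

If the check reports that the environment is not active, re-run the conda activate command above before installing fairseq.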
See INSTALL.md for instructions on installing fairseq.
To download the ActivityNet-SRL-QA and Charades-SRL-QA datasets, see DATA.md.
cd $ROOT
python code/main_dist.py "vogqap_asrlqa" --ds_to_use='asrl_qa' --mdl.name='vog_qa' --train.bs=4 --train.epochs=10 --train.lr=1e-4
For --mdl.name, use one of the models: lqa, mtx_qa, butd_qa, vog_qa.
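To train all four listed models in turn, the training command can be wrapped in a loop. This is a sketch under stated assumptions: the run helper and the DRY_RUN variable are hypothetical conveniences, and the run name "<model>_asrlqa" passed as the first argument is an illustrative naming choice rather than a convention taken from the repository. All flags are copied from the training command above.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Model names as listed above; "<model>_asrlqa" is an assumed run name.
for mdl in lqa mtx_qa butd_qa vog_qa; do
  run python code/main_dist.py "${mdl}_asrlqa" \
    --ds_to_use='asrl_qa' --mdl.name="${mdl}" \
    --train.bs=4 --train.epochs=10 --train.lr=1e-4
done
```

Run it from $ROOT. Printing the commands first makes it easy to check the arguments before committing to four full training runs.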
The evaluation script is vidqa_code/eval_fn_vidqap.py. You can also use it as a stand-alone file for a separate dataset.
cd $ROOT
python vidqa_code/eval_fn_vidqap.py --pred_file=... --ds_to_use='asrl_qa' --split_type='valid' --met_keys='meteor,rouge,bert_score'
ToDo:
We thank:
@inproceedings{Sadhu2021VideoQA,
title={Video Question Answering with Phrases via Semantic Roles},
author={Arka Sadhu and Kan Chen and R. Nevatia},
booktitle={NAACL},
year={2021}
}