We propose a Lightweight Visual-Linguistic Reasoning framework (LiVLR) for VideoQA. (Overview figure of the LiVLR architecture.)
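For orientation, below is a minimal, hypothetical sketch of a generic visual-linguistic VideoQA forward pass: pre-extracted frame features and question embeddings are fused by attention and mapped to answer logits. This is an illustration only, not the LiVLR implementation; the module names, dimensions, and the PyTorch dependency are all assumptions.

```python
# Illustrative toy VideoQA model (NOT the LiVLR implementation).
# All names and dimensions below are hypothetical.
import torch
import torch.nn as nn

class ToyVideoQA(nn.Module):
    def __init__(self, vis_dim=2048, txt_dim=300, hid=512, num_answers=1000):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, hid)                  # project frame features
        self.txt_enc = nn.GRU(txt_dim, hid, batch_first=True)    # encode the question
        self.attn = nn.MultiheadAttention(hid, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(hid, num_answers)            # answer logits

    def forward(self, frame_feats, question_emb):
        # frame_feats:  (B, T, vis_dim) pre-extracted visual features
        # question_emb: (B, L, txt_dim) word embeddings of the question
        v = self.vis_proj(frame_feats)
        _, q = self.txt_enc(question_emb)        # q: (1, B, hid)
        q = q.transpose(0, 1)                    # (B, 1, hid)
        fused, _ = self.attn(q, v, v)            # question attends over video frames
        return self.classifier(fused.squeeze(1)) # (B, num_answers)
```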
Figures: comparison with SoTA and trainable parameters.
conda create -n livlr_qa python=3.6
conda activate livlr_qa
conda install -c conda-forge ffmpeg
conda install -c conda-forge scikit-video
pip install -r requirements.txt
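After installation, you can optionally check that video decoding works in the new environment. The snippet below is a small sanity check added here for illustration (it uses the sample clip bundled with scikit-video and is not part of the repository):

```python
# Optional sanity check (not part of the repo): confirm scikit-video can
# decode video through ffmpeg in the livlr_qa environment.
import skvideo.datasets
import skvideo.io

video = skvideo.io.vread(skvideo.datasets.bigbuckbunny())  # bundled sample clip
print(video.shape)  # (num_frames, height, width, channels)
```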
CUDA_VISIBLE_DEVICES=1,2 python train.py --exp_name Exp_DiVS/all
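As a side note (an addition here, assuming the training code is PyTorch-based, which is typical for VideoQA models): CUDA_VISIBLE_DEVICES=1,2 restricts the process to physical GPUs 1 and 2, which are then re-indexed as cuda:0 and cuda:1 inside the process:

```python
# Illustration only (assumes PyTorch): with CUDA_VISIBLE_DEVICES=1,2 the
# process sees exactly two devices, numbered from 0.
import torch

print(torch.cuda.device_count())      # 2
print(torch.cuda.get_device_name(0))  # the card that is physically GPU 1
```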
Set the correct file paths and run the following command:
python test.py