hila-chefer / Transformer-MM-Explainability

[ICCV 2021 - Oral] Official PyTorch implementation for "Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers", a novel method to visualize any Transformer-based network, including examples for DETR and VQA.
MIT License

The data set downloaded automatically is too large #15

Closed Shuai-Lv closed 2 years ago

Shuai-Lv commented 2 years ago

Hi~ I want to reproduce your results so I can follow up on your work. The problem is that when I execute the following script:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python VisualBERT/run.py --method= --is-text-pert=<true/false> --is-positive-pert=<true/false> --num-samples=10000 config=projects/visual_bert/configs/vqa2/defaults.yaml model=visual_bert dataset=vqa2 run_type=val checkpoint.resume_zoo=visual_bert.finetuned.vqa2.from_coco_train env.data_dir=/path/to/data_dir training.num_workers=0 training.batch_size=1 training.trainer=mmf_pert training.seed=1234

Colab starts downloading a large dataset of about 104 GB.

Am I doing this correctly? If so, is there an alternative to downloading such a large dataset? If not, how should I execute the script? Thanks for reading~

hila-chefer commented 2 years ago

Hi @Shuai-Lv, thanks for your interest! This code is based on the code of the MMF library by Facebook research. You could probably settle for downloading only the validation set to save some space, as it is enough to run the experiments from the paper.
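If you want to verify how much of the dataset actually ends up on disk before launching the run, a small generic helper like the following can be used. This is an illustrative utility, not part of this repository or of MMF, and the `/path/to/data_dir` path is the same placeholder used in the command above:

```python
import os

def dir_size_gb(path):
    """Sum the sizes of all regular files under `path`, in gigabytes.

    Returns 0.0 if the directory does not exist yet, since os.walk
    simply yields nothing for a missing path.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total / 1e9

# Example: check the footprint of the MMF data directory. With only the
# validation split in place, this should be far below the ~104 GB of the
# full download.
print(dir_size_gb("/path/to/data_dir"))
```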

I hope this helps. Best, Hila