
[ACMMM 2023] BMMAL: Towards Balanced Active Learning for Multimodal Classification
https://arxiv.org/abs/2306.08306
Creative Commons Attribution 4.0 International

Balanced Multimodal Active Learning (ACMMM-2023)

This is the official implementation of "Towards Balanced Active Learning for Multimodal Classification".

Code Tree

Supported Active Learning Strategies

Setup Anaconda

conda create -n mmal python=3.9
conda activate mmal
# we have only tested with pytorch 1.13.1 and cuda 11.6
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
# install the remaining dependencies with pip
pip install -r requirements.txt

Dataset

├── UPMC_Food101
    ├── train.json
    ├── test.json
    ├── images
        ├── train
            ├── label_name
                ├── label_name_id.jpg
                ├── ...
        ├── test
            ├── label_name
                ├── label_name_id.jpg
                ├── ...
    ├── texts_txt
        ├── label_name
            ├── label_name_id.txt
            ├── ...
├── kinetics_sound
    ├── my_train.txt
    ├── my_test.txt
    ├── train
        ├── video
            ├── label_name
                ├── vid_start_end
                    ├── frame_0.jpg
                    ├── frame_1.jpg
                    ├── ...
                    ├── frame_9.jpg
        ├── audio
            ├── label_name
                ├── vid_start_end.wav
                ├── ...
    ├── test
        ├── ...
├── vggsound
    ├── vggsound.csv
    ├── video
        ├── train
            ├── label_name
                ├── vid_start_end.avi
        ├── test
            ├── ...
    ├── frames
        ├── train
            ├── label_name
                ├── vid_start_end
                    ├── frame_0.jpg
                    ├── frame_1.jpg
                    ├── ...
                    ├── frame_9.jpg
        ├── test
            ├── ...
    ├── audio
        ├── train
            ├── label_name
                ├── vid_start_end.wav
        ├── test
            ├── ...

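As a sketch of how the UPMC_Food101 layout above can be consumed, the helper below pairs each image with its caption file via the shared `label_name_id` stem. The root path and the assumption that image and text stems match exactly are illustrative; the repo's own dataloaders are authoritative.

```python
from pathlib import Path

def pair_food101_samples(root, split="train"):
    """Pair UPMC_Food101 images with caption .txt files by shared stem.

    Assumes the layout shown above: images/{split}/{label}/{stem}.jpg and
    texts_txt/{label}/{stem}.txt. Returns (image_path, text_path, label) tuples.
    """
    root = Path(root)
    pairs = []
    for img in sorted((root / "images" / split).glob("*/*.jpg")):
        label = img.parent.name
        txt = root / "texts_txt" / label / (img.stem + ".txt")
        if txt.exists():  # skip samples whose caption file is missing
            pairs.append((img, txt, label))
    return pairs
```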
Run the Experiments

cd mmal
export PYTHONPATH=$PWD
python run/runner.py -s {strategy} --seed {random_seed} -c {config_file} -d {cuda_device_index} -r {al_iteration} 

Currently, experiments are only supported on a single GPU.
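The flags in the usage line above can be scripted for a full active learning run. The sketch below only builds the `run/runner.py` command for each round; the strategy, seed, and config values are illustrative, and the flag names are taken from the usage line.

```python
import subprocess

def runner_cmd(strategy, seed, config, device, al_round):
    """Build the run/runner.py invocation for one active learning round."""
    return [
        "python", "run/runner.py",
        "-s", strategy,       # active learning strategy
        "--seed", str(seed),  # random seed
        "-c", config,         # config file
        "-d", str(device),    # cuda device index
        "-r", str(al_round),  # active learning iteration
    ]

# run 5 rounds back to back (uncomment to actually execute)
# for r in range(5):
#     subprocess.run(runner_cmd("bmmal", 1000, "config/food101.yml", 0, r), check=True)
```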

(Important) For fair comparison

To ensure each strategy begins from the same initialization, we highly recommend starting from a copy of the model trained with random sampling.

For example, if you want to compare the performance of bmmal and badge:

# run the first iteration of active learning using random sampling
python run/runner.py -s random --seed 1000 -c config/food101.yml -d 0 -r 0
# keep a copy of the first iteration
mkdir -p logs/food101/food101-random-initialized-1000
cp -r logs/food101/food101-random-1000/version_0 logs/food101/food101-random-initialized-1000/version_0
cp logs/food101/food101-random-1000/task_model.ckpt logs/food101/food101-random-initialized-1000/task_model.ckpt

# make a copy and rename it for bmmal
mkdir -p logs/food101/food101-bmmal-1000
cp -r logs/food101/food101-random-initialized-1000/version_0 logs/food101/food101-bmmal-1000/version_0
cp logs/food101/food101-random-initialized-1000/task_model.ckpt logs/food101/food101-bmmal-1000/task_model.ckpt
# start bmmal sampling and training for the second iteration
python run/runner.py -s bmmal --seed 1000 -c config/food101.yml -d 0 -r 1

# make a copy and rename it for badge
mkdir -p logs/food101/food101-badge-1000
cp -r logs/food101/food101-random-initialized-1000/version_0 logs/food101/food101-badge-1000/version_0
cp logs/food101/food101-random-initialized-1000/task_model.ckpt logs/food101/food101-badge-1000/task_model.ckpt
# start badge sampling and training for the second iteration
python run/runner.py -s badge --seed 1000 -c config/food101.yml -d 0 -r 1

By doing so, different strategies can be compared fairly, since they all start from the same iteration-zero model.
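The copy steps above can be wrapped in a small helper. The function below clones the random-initialized round-zero logs and checkpoint into a new strategy directory; the directory naming convention is an assumption based on the example commands above.

```python
import shutil
from pathlib import Path

def clone_init(strategy, dataset="food101", seed=1000, logs="logs"):
    """Copy the random-initialized round-zero state for a new strategy."""
    src = Path(logs) / dataset / f"{dataset}-random-initialized-{seed}"
    dst = Path(logs) / dataset / f"{dataset}-{strategy}-{seed}"
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src / "version_0", dst / "version_0")          # round-zero logs
    shutil.copy(src / "task_model.ckpt", dst / "task_model.ckpt")  # model weights
    return dst
```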

Monitor Active Learning in Tensorboard

Log File Tree

If we run the active learning loop for 5 rounds, version_0 through version_4 will store the log files for each round.
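To enumerate the per-round log directories in order, a small helper can sort them numerically (this assumes only the `version_{i}` naming convention described above):

```python
from pathlib import Path

def round_log_dirs(run_dir):
    """Return per-round log directories (version_0, version_1, ...) in order."""
    dirs = Path(run_dir).glob("version_*")
    # sort numerically so version_10 comes after version_9, not version_1
    return sorted(dirs, key=lambda p: int(p.name.split("_")[1]))
```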

tensorboard --logdir logs/{logger_save_dir}

License

This repo is under CC BY 4.0 License. See LICENSE for details.

Cite

If you find this code helpful, please cite our paper:

@article{shen2023towards,
  title={Towards Balanced Active Learning for Multimodal Classification},
  author={Shen, Meng and Huang, Yizheng and Yin, Jianxiong and Zou, Heqing and Rajan, Deepu and See, Simon},
  journal={arXiv preprint arXiv:2306.08306},
  year={2023}
}