okankop / Efficient-3DCNNs

PyTorch implementation of "Resource Efficient 3D Convolutional Neural Networks", with code and pretrained models.
MIT License

Efficient-3DCNNs

PyTorch implementation of the article "Resource Efficient 3D Convolutional Neural Networks", with code and pretrained models.

Update!

3D ResNet and 3D ResNeXt models have been added! The details of these models can be found in link.

Requirements

Pre-trained models

Pretrained models can be downloaded from here.

Implemented models:

Results

Dataset Preparation

Kinetics

python utils/kinetics_json.py train_csv_path val_csv_path video_dataset_path dst_json_path

Jester

python utils/n_frames_jester.py dataset_directory
python utils/jester_json.py annotation_dir_path

UCF-101

python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
python utils/ucf101_json.py annotation_dir_path

Running the code

Model configurations are given as follows:

ShuffleNetV1-1.0x : --model shufflenet   --width_mult 1.0 --groups 3
ShuffleNetV2-1.0x : --model shufflenetv2 --width_mult 1.0
MobileNetV1-1.0x  : --model mobilenet    --width_mult 1.0
MobileNetV2-1.0x  : --model mobilenetv2  --width_mult 1.0
SqueezeNet        : --model squeezenet   --version 1.1
ResNet-18         : --model resnet       --model_depth 18  --resnet_shortcut A
ResNet-50         : --model resnet       --model_depth 50  --resnet_shortcut B
ResNet-101        : --model resnet       --model_depth 101 --resnet_shortcut B
ResNeXt-101       : --model resnext      --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
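
When scripting several runs, the configurations listed above can be kept in a small lookup table. The following is a minimal sketch and not part of the repository: `MODEL_FLAGS` and `build_model_args` are hypothetical names, and only the flags shown in the list above are used.

```python
# Hypothetical helper (not part of the repository): maps each configuration
# name from the list above to the command-line flags given for it.
MODEL_FLAGS = {
    "ShuffleNetV1-1.0x": ["--model", "shufflenet", "--width_mult", "1.0", "--groups", "3"],
    "ShuffleNetV2-1.0x": ["--model", "shufflenetv2", "--width_mult", "1.0"],
    "MobileNetV1-1.0x": ["--model", "mobilenet", "--width_mult", "1.0"],
    "MobileNetV2-1.0x": ["--model", "mobilenetv2", "--width_mult", "1.0"],
    "SqueezeNet": ["--model", "squeezenet", "--version", "1.1"],
    "ResNet-18": ["--model", "resnet", "--model_depth", "18", "--resnet_shortcut", "A"],
    "ResNet-50": ["--model", "resnet", "--model_depth", "50", "--resnet_shortcut", "B"],
    "ResNet-101": ["--model", "resnet", "--model_depth", "101", "--resnet_shortcut", "B"],
    "ResNeXt-101": ["--model", "resnext", "--model_depth", "101", "--resnet_shortcut", "B",
                    "--resnext_cardinality", "32"],
}


def build_model_args(name):
    """Return a copy of the flag list for a named configuration."""
    try:
        return list(MODEL_FLAGS[name])
    except KeyError:
        raise ValueError(f"unknown model configuration: {name!r}") from None


if __name__ == "__main__":
    # Print the flags for one configuration as they would appear on a command line.
    print(" ".join(build_model_args("ResNeXt-101")))
```

The returned list can be appended to the remaining arguments of a run (dataset paths and training options), which are not covered by this sketch.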

Please check the resource-efficient 3D CNN models in the models folder and run the code with the necessary parameters. An example run is given as follows:

Augmentations

There are several augmentation techniques available. Please check spatial_transforms.py and temporal_transforms.py for the details of the augmentation methods.

Note: Do not use "RandomHorizontalFlip" when training on the Jester dataset, as it changes the label of some classes (e.g. Swipe_Left --> RandomHorizontalFlip() --> Swipe_Right).
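
The effect described in the note can be reproduced on a toy clip. The sketch below is only an illustration and does not use the repository's transforms: `horizontal_flip`, `motion_direction`, and `make_swipe_left` are hypothetical helpers, a frame is a plain list of pixel rows, and the "static" label is invented for the example.

```python
def horizontal_flip(clip):
    """Mirror every frame left-to-right; a frame is a list of pixel rows."""
    return [[row[::-1] for row in frame] for frame in clip]


def motion_direction(clip):
    """Classify motion by tracking the column of the brightest pixel in row 0."""
    first = clip[0][0].index(max(clip[0][0]))
    last = clip[-1][0].index(max(clip[-1][0]))
    if last < first:
        return "Swipe_Left"
    if last > first:
        return "Swipe_Right"
    return "static"  # hypothetical label for clips without horizontal motion


def make_swipe_left(width=5):
    """Toy one-row clip: a single bright pixel moves from the right edge to the left."""
    clip = []
    for col in range(width - 1, -1, -1):
        row = [0] * width
        row[col] = 1
        clip.append([row])
    return clip


clip = make_swipe_left()
print(motion_direction(clip))                   # Swipe_Left
print(motion_direction(horizontal_flip(clip)))  # Swipe_Right
```

The pixels are mirrored but the label attached to the clip is not, so a flipped Swipe_Left sample would be trained with the wrong target.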

Calculating Video Accuracy

To calculate video accuracy, first run the model in '--test' mode to create 'val.json'. Then run 'video_accuracy.py' in the utils folder to calculate the video accuracies.
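
Video-level accuracy is commonly obtained by aggregating clip-level predictions for each video. The sketch below illustrates that idea with score averaging. The data layout (a mapping from a video id to a list of per-clip class scores) is an assumption made for this example; it is not the actual format of 'val.json', and `video_accuracy` is a hypothetical function rather than the logic of 'video_accuracy.py'.

```python
def video_accuracy(clip_scores, labels):
    """Average the clip scores of each video, take the argmax, compare to the label.

    clip_scores: dict mapping video id -> list of per-clip score lists (assumed layout)
    labels:      dict mapping video id -> ground-truth class index
    """
    correct = 0
    for video_id, scores in clip_scores.items():
        n_classes = len(scores[0])
        # Mean score per class over all clips of this video.
        mean = [sum(clip[c] for clip in scores) / len(scores) for c in range(n_classes)]
        predicted = max(range(n_classes), key=mean.__getitem__)
        correct += int(predicted == labels[video_id])
    return correct / len(clip_scores)


clip_scores = {
    "video_a": [[0.6, 0.4], [0.2, 0.8], [0.1, 0.9]],  # mean favours class 1
    "video_b": [[0.7, 0.3], [0.6, 0.4]],              # mean favours class 0
}
labels = {"video_a": 1, "video_b": 1}
print(video_accuracy(clip_scores, labels))  # 0.5
```

Note that the first clip of "video_a" alone would be misclassified; averaging over the clips recovers the correct video-level prediction.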

Calculating FLOPs

To calculate FLOPs, run the file 'calculate_FLOP.py'. You first need to uncomment the desired model in the file.
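
As background, the cost of a single 3D convolution can be worked out by hand. The sketch below counts multiply-accumulate operations (MACs) for one bias-free layer; it is an independent illustration, not the counting method of 'calculate_FLOP.py'. The function names and the 16x112x112 input size are assumptions made for the example, and whether one MAC is reported as one or two FLOPs depends on the tool.

```python
def conv3d_output_size(size, kernel, stride=1, padding=0):
    """Output length along one dimension of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d_macs(c_in, c_out, input_size, kernel,
                stride=(1, 1, 1), padding=(0, 0, 0), groups=1):
    """Multiply-accumulate count of one bias-free 3D convolution.

    input_size, kernel, stride, and padding are (time, height, width) triples.
    """
    out = [conv3d_output_size(s, k, st, p)
           for s, k, st, p in zip(input_size, kernel, stride, padding)]
    kernel_volume = kernel[0] * kernel[1] * kernel[2]
    # Each output value sums over the input channels of its group only.
    macs_per_output = (c_in // groups) * kernel_volume
    return c_out * out[0] * out[1] * out[2] * macs_per_output


# Standard 3x3x3 convolution, 3 -> 32 channels, stride (1, 2, 2),
# on an assumed 16x112x112 input clip.
standard = conv3d_macs(3, 32, (16, 112, 112), (3, 3, 3),
                       stride=(1, 2, 2), padding=(1, 1, 1))
print(standard)   # 130056192

# Depthwise 3x3x3 convolution on 32 channels (groups == channels).
depthwise = conv3d_macs(32, 32, (16, 56, 56), (3, 3, 3),
                        padding=(1, 1, 1), groups=32)
print(depthwise)  # 43352064
```

The depthwise layer needs 32 times fewer MACs than an ungrouped 32-to-32 convolution of the same size, which is the kind of saving the grouped and depthwise models listed above rely on.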

Citation

Please cite the following article if you use this code or pre-trained models:

@inproceedings{kopuklu2019resource,
  title={Resource efficient 3d convolutional neural networks},
  author={K{\"o}p{\"u}kl{\"u}, Okan and Kose, Neslihan and Gunduz, Ahmet and Rigoll, Gerhard},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)},
  pages={1910--1919},
  year={2019},
  organization={IEEE}
}

Acknowledgement

We thank Kensho Hara for releasing his codebase, on which we built our work.