Yi Pan (panyi_jsjy@nudt.edu.cn), Jun-Jie Huang* (jjhuang@nudt.edu.cn), Zihan Chen, Wentao Zhao, and Ziyue Wang (*corresponding author)
PyTorch implementation for "SVASTIN: Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks" (ICME'2024).
Robust and imperceptible adversarial video attacks are challenging due to the spatial and temporal characteristics of videos. Existing video adversarial attack methods mainly take a gradient-based approach and generate adversarial videos with noticeable perturbations. In this paper, we propose a novel Sparse Adversarial Video Attack via Spatio-Temporal Invertible Neural Networks (SVASTIN) to generate adversarial videos through information exchange in the spatio-temporal feature space. It consists of a Guided Target Video Learning (GTVL) module, which balances the perturbation budget and optimization speed, and a Spatio-Temporal Invertible Neural Network (STIN) module, which exchanges information in the spatio-temporal feature space between a source video and the target feature tensor learned by the GTVL module. Extensive experiments on UCF-101 and Kinetics-400 demonstrate that the proposed SVASTIN generates adversarial examples with higher imperceptibility and a higher fooling rate than state-of-the-art methods.
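To give some intuition for what "invertible" means here, the following is a minimal sketch of a generic additive coupling block, a common building block of invertible neural networks. It is not the STIN architecture from the paper: the functions `f` and `g`, the helper names, and the toy feature vectors are all hypothetical and chosen only for illustration. The sketch mixes a "source" and a "target" feature vector in the forward pass and recovers both (up to floating-point rounding) in the inverse pass.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Mixer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]


def sub(a: Vector, b: Vector) -> Vector:
    """Element-wise difference of two equally sized vectors."""
    return [x - y for x, y in zip(a, b)]


def coupling_forward(source: Vector, target: Vector,
                     f: Mixer, g: Mixer) -> Tuple[Vector, Vector]:
    """Exchange information between the two branches."""
    mixed_source = add(source, f(target))        # source receives target info
    mixed_target = add(target, g(mixed_source))  # target receives source info
    return mixed_source, mixed_target


def coupling_inverse(mixed_source: Vector, mixed_target: Vector,
                     f: Mixer, g: Mixer) -> Tuple[Vector, Vector]:
    """Undo coupling_forward by reversing its two steps in opposite order."""
    target = sub(mixed_target, g(mixed_source))
    source = sub(mixed_source, f(target))
    return source, target


# Hypothetical stand-ins for learned sub-networks; they need not be invertible.
def f(v: Vector) -> Vector:
    return [0.1 * math.tanh(x) for x in v]


def g(v: Vector) -> Vector:
    return [0.05 * math.sin(x) for x in v]


source = [0.2, -0.4, 0.9]   # toy "source video" features (assumption)
target = [1.0, 0.5, -0.3]   # toy "target" features (assumption)

y1, y2 = coupling_forward(source, target, f, g)
x1, x2 = coupling_inverse(y1, y2, f, g)

# Largest reconstruction error over both branches; only rounding error remains.
max_error = max(abs(a - b) for a, b in zip(source + target, x1 + x2))
print(max_error)
```

Because the inverse only subtracts what the forward pass added, `f` and `g` can be arbitrary functions. This property is what lets invertible networks use expressive sub-networks while remaining exactly reversible.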
Download the datasets (UCF-101 and Kinetics-400) and prepare the data by following the mmaction2 instructions.
Models for the Kinetics-400 dataset are all available in the mmaction2 library. We consider three models (MVIT, SLOWFAST, and TSN). In addition, we fine-tune these models on the UCF-101 dataset. You can find them in Google Drive.
You can run attack.py directly.
1) attack.py: Execute this file to run the attack
2) args.py: Video and model parameter settings
3) config.py: Hyperparameter settings
4) model/: Architecture of Spatio-Temporal Invertible Neural Networks
5) checkpoints/: Pre-trained model parameters
If you find this code and data useful, please consider citing the original work by the authors:
@INPROCEEDINGS{10688258,
author={Pan, Yi and Huang, Jun-Jie and Chen, Zihan and Zhao, Wentao and Wang, Ziyue},
booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
title={SVASTIN: Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks},
year={2024},
volume={},
number={},
pages={1-6},
keywords={Tensors;Codes;Perturbation methods;Neural networks;Streaming media;Optimization;Sparse Video Adversarial Attack;Invertible Neural Networks;Spatio-Temporal},
doi={10.1109/ICME57554.2024.10688258}
}
If you have any questions, please contact panyi_jsjy@nudt.edu.cn.