AlphAction aims to detect the actions of multiple persons in videos. It is the first open-source project that achieves 30+ mAP (32.4 mAP) with a single model on the AVA dataset.
This project is the official implementation of the paper Asynchronous Interaction Aggregation for Action Detection (ECCV 2020), authored by Jiajun Tang, Jin Xia (equal contribution), Xinzhi Mu, Bo Pang, Cewu Lu (corresponding author).
First, install this project by following the instructions in INSTALL.md.
To train or run inference on the AVA dataset, see DATA.md for data preparation instructions. If you have difficulty accessing Google Drive, you can instead find most files (including models) on Baidu NetDisk ([link], code: smti).
Please see MODEL_ZOO.md for downloading models.
To train or run inference with AlphAction, please refer to GETTING_STARTED.md.
To run the demo program on a video or webcam, see the demo folder. We selected 15 common categories from the 80 action categories of AVA and provide a practical model that achieves high accuracy (about 70 mAP) on these categories.
We gratefully acknowledge the computing resource support of Huawei Corporation for this project.
If this project helps you in your research or project, please cite this paper:
@inproceedings{tang2020asynchronous,
title={Asynchronous Interaction Aggregation for Action Detection},
author={Tang, Jiajun and Xia, Jin and Mu, Xinzhi and Pang, Bo and Lu, Cewu},
booktitle={Proceedings of the European conference on computer vision (ECCV)},
year={2020}
}