Self-supervised Multi-view Multi-Human Association and Tracking
Yiyang Gan, Ruize Han, Liqiang Yin, Wei Feng, Song Wang
Contact: realgump@tju.edu.cn. Any questions or discussions are welcomed!
Multi-human association and tracking (MHAT) with multi-view cameras aims to track a group of people over time in each view while identifying the same person across different views at the same time. This is a relatively new problem with significance for video surveillance of multi-person scenes. Unlike previous multiple object tracking (MOT) and multi-target multi-camera tracking (MTMCT) tasks, which only consider human association over time, multi-view MHAT requires joint cross-spatial-temporal data association. In this paper, we model this problem with a self-supervised learning framework and propose an end-to-end network to solve it. Specifically, we propose a spatial-temporal association network with three designed self-supervised losses, i.e., a self-similarity loss, a transitive-similarity loss, and a symmetrical-consistency loss, to simultaneously associate people over time and across views. Besides, to promote research on multi-view MHAT, we build a new large-scale benchmark for algorithm training and testing. Extensive experiments on the proposed dataset verify the effectiveness of our method.
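The three self-supervised losses above can be illustrated with soft assignment matrices built from appearance features. The sketch below is not the authors' exact implementation; it only shows, under simple assumptions (each view/frame represented by an L2-normalized feature matrix of shape `(num_people, feat_dim)`, all names illustrative), what each consistency constraint asks for:

```python
import torch
import torch.nn.functional as F

def sim(a, b, tau=0.1):
    """Row-wise softmax-normalized similarity (soft assignment) matrix."""
    return torch.softmax(a @ b.t() / tau, dim=1)

def self_similarity_loss(x):
    # Matching a view against itself should recover the identity assignment.
    return F.mse_loss(sim(x, x), torch.eye(x.size(0)))

def symmetrical_consistency_loss(x1, x2):
    # Matching view 1 -> view 2 should be the transpose of view 2 -> view 1.
    return F.mse_loss(sim(x1, x2), sim(x2, x1).t())

def transitive_similarity_loss(x1, x2, x3):
    # Chaining assignments 1 -> 2 -> 3 should agree with the direct 1 -> 3 match.
    return F.mse_loss(sim(x1, x2) @ sim(x2, x3), sim(x1, x3))

# Toy usage: three "views" of the same 4 people with random features.
feats = [F.normalize(torch.randn(4, 128), dim=1) for _ in range(3)]
loss = (self_similarity_loss(feats[0])
        + symmetrical_consistency_loss(feats[0], feats[1])
        + transitive_similarity_loss(*feats))
```

Because the same constraints hold whether the two feature sets come from different cameras or different frames, one set of losses covers both cross-view and over-time association without identity labels.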
The code was tested on Ubuntu 16.04, with Anaconda Python 3.6 and PyTorch v1.7.1. NVIDIA GPUs are needed for both training and testing. After installing Anaconda, create a conda environment:
conda create -n MVMHAT python=3.6
Then activate the environment and install PyTorch:
conda activate MVMHAT
conda install pytorch=1.7.1 torchvision -c pytorch
MVMHAT_ROOT=/path/to/clone/MVMHAT
git clone https://github.com/realgump/MvMHAT.git $MVMHAT_ROOT
pip install -r requirements.txt
cd $MVMHAT_ROOT/models
wget https://download.pytorch.org/models/resnet50-19c8e357.pth -O pretrained.pth
Link: Baidu Netdisk
Password: 2cfh
Link: Baidu Netdisk
Password: 8sg9
Link: Baidu Netdisk
Password: jjaw
Link: OneDrive
Password: MvMHAT
If you find this project useful for your research, please use the following BibTeX entry.
@inproceedings{gan2021mvmhat,
title={Self-supervised Multi-view Multi-Human Association and Tracking},
author={Gan, Yiyang and Han, Ruize and Yin, Liqiang and Feng, Wei and Wang, Song},
booktitle={ACM MM},
year={2021}
}
More information is coming soon ...