
🏆VOTS 2023 Winner: DMAOT(Decoupled Memory AOT)

Intro

DMAOT ranked 1st in the VOTS 2023 challenge (leaderboard). As a plug-and-play method, DMAOT enhances the segmentation ability of the AOT-series algorithms on long videos without requiring any additional training.

vots2023_certificate

Instance-wise long-term memories

We decouple the frame-wise long-term memory used in the AOT series frameworks and transform it into instance-wise long-term memory. This enhancement provides more precise control over the long-term memory bank of each individual, facilitating fine-grained memory management.
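The decoupling described above can be sketched as a minimal data structure: instead of one shared frame-wise memory list, each object ID owns its own capacity-limited bank. The class and parameter names below are illustrative assumptions, not the actual DMAOT API.

```python
from collections import defaultdict


class InstanceMemoryBank:
    """Sketch of an instance-wise long-term memory bank (hypothetical API).

    Each object ID gets its own memory list with its own capacity,
    instead of all objects sharing a single frame-wise memory list.
    """

    def __init__(self, max_frames: int = 10):
        self.max_frames = max_frames           # capacity per instance, not per video
        self.banks = defaultdict(list)         # obj_id -> list of memory embeddings

    def add(self, obj_id: int, embedding) -> bool:
        """Try to store a memory for one object; return False when its bank is full."""
        bank = self.banks[obj_id]
        if len(bank) < self.max_frames:
            bank.append(embedding)
            return True
        return False                           # full: caller applies a dropout strategy

    def size(self, obj_id: int) -> int:
        return len(self.banks[obj_id])
```

Because capacity is tracked per instance, one object filling its bank never evicts memories belonging to another object, which is what enables the fine-grained management mentioned above.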

Instance-wise long-term memory bank

Dropout frame strategy based on cosine similarity

We also apply a dropout frame strategy based on cosine similarity when an instance-wise long-term memory bank reaches its maximum number of frames. This strategy ensures that each long-term memory bank retains higher-quality memories.

dropout frame strategy based on cosine similarity
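One plausible reading of this strategy is: when a bank is full, drop the stored memory that is most redundant, measured by its mean cosine similarity to the other stored memories, before inserting the new frame. The exact selection rule in DMAOT may differ; the function below is a sketch under that assumption.

```python
import numpy as np


def cosine_sim(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def insert_with_dropout(bank, new_emb, max_frames):
    """Insert a new memory embedding; if the bank is full, first drop the
    most redundant stored memory (assumed rule: highest mean cosine
    similarity to the other stored memories)."""
    if len(bank) < max_frames:
        bank.append(new_emb)
        return bank
    # Redundancy score: mean cosine similarity to every other stored memory.
    scores = [
        np.mean([cosine_sim(m, other) for j, other in enumerate(bank) if j != i])
        for i, m in enumerate(bank)
    ]
    bank.pop(int(np.argmax(scores)))   # evict the most redundant memory
    bank.append(new_emb)
    return bank
```

With two identical vectors and one orthogonal vector in a full bank, the duplicate is evicted first, so the surviving memories stay diverse.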

Prerequisites

Install python packages

Run tracker

Edit configuration files

Get results

```shell
bash evaluate.sh
```

Evaluate

Thanks

DMAOT is built on the AOT-Benchmark, which now supports both AOT and DeAOT. Thanks for such an excellent implementation.

Citations

Please consider citing the related paper(s) in your publications if it helps your research.

@inproceedings{yang2022deaot,
  title={Decoupling Features in Hierarchical Propagation for Video Object Segmentation},
  author={Yang, Zongxin and Yang, Yi},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022}
}
@article{yang2021aost,
  title={Scalable Video Object Segmentation with Identification Mechanism},
  author={Yang, Zongxin and Wang, Xiaohan and Miao, Jiaxu and Wei, Yunchao and Wang, Wenguan and Yang, Yi},
  journal={arXiv preprint arXiv:2203.11442},
  year={2023}
}
@inproceedings{yang2021aot,
  title={Associating Objects with Transformers for Video Object Segmentation},
  author={Yang, Zongxin and Wei, Yunchao and Yang, Yi},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2021}
}
@inproceedings{kristan2023first,
  title={The First Visual Object Tracking Segmentation VOTS2023 Challenge Results},
  author={Kristan, Matej and Matas, Ji{\v{r}}{\'\i} and Danelljan, Martin and Felsberg, Michael and Chang, Hyung Jin and Zajc, Luka {\v{C}}ehovin and Luke{\v{z}}i{\v{c}}, Alan and Drbohlav, Ondrej and Zhang, Zhongqun and Tran, Khanh-Tung and others},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1796--1818},
  year={2023}
}
@article{cheng2023segment,
  title={Segment and Track Anything},
  author={Cheng, Yangming and Li, Liulei and Xu, Yuanyou and Li, Xiaodi and Yang, Zongxin and Wang, Wenguan and Yang, Yi},
  journal={arXiv preprint arXiv:2305.06558},
  year={2023}
}

License

This project is released under the BSD-3-Clause license. See LICENSE for additional details.