
TF-CLIP: Learning Text-Free CLIP for Video-Based Person Re-identification (AAAI 2024)
Framework

Large-scale language-image pre-trained models (e.g., CLIP) have shown superior performances on many cross-modal retrieval tasks. However, the problem of transferring the knowledge learned from such models to video-based person re-identification (ReID) has barely been explored. In addition, there is a lack of decent text descriptions in current ReID benchmarks. To address these issues, in this work, we propose a novel one-stage text-free CLIP-based learning framework named TF-CLIP for video-based person ReID.
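To make the text-free idea concrete, here is a minimal sketch of the baseline view it starts from: encode each frame with CLIP's image encoder and pool the frame features into a single video-level representation, with no text branch involved. This is only an illustration under assumed names (the `video_feature` helper and the ViT-B/16 backbone are our choices), not the actual TF-CLIP pipeline.

```python
# Minimal sketch (assumption, not this repo's code): a text-free video
# representation built from CLIP's image encoder plus temporal pooling.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

def video_feature(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, 3, 224, 224) tensor of preprocessed frames from one tracklet."""
    with torch.no_grad():
        feats = model.encode_image(frames.to(device))      # (T, D) frame features
        feats = feats / feats.norm(dim=-1, keepdim=True)   # L2-normalize per frame
    return feats.mean(dim=0)                               # temporal average pool -> (D,)
```

TF-CLIP builds on top of such frame-level CLIP features rather than pairing them with text; see the paper for the full framework.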

:loudspeaker: News

:fire: Highlight

:memo: Results

:bookmark_tabs: Installation

:car: Run TF-CLIP

For example, if you want to run the method on MARS, you need to modify the bottom of configs/vit_base.yml to:

```yaml
DATASETS:
  NAMES: ('MARS')
  ROOT_DIR: ('your_dataset_dir')
OUTPUT_DIR: 'your_output_dir'
```

Then, run

```bash
CUDA_VISIBLE_DEVICES=0 python train-main.py
```
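If the project follows the yacs configuration pattern used by CLIP-ReID (an assumption based on the acknowledgments below, not verified against this repo's code), the YAML above is merged over built-in defaults roughly like this:

```python
# Hypothetical sketch of loading configs/vit_base.yml with yacs,
# mirroring the CLIP-ReID pattern (an assumption, not this repo's code).
from yacs.config import CfgNode as CN

cfg = CN()
cfg.DATASETS = CN()
cfg.DATASETS.NAMES = 'MARS'      # defaults; overridden by the YAML file
cfg.DATASETS.ROOT_DIR = ''
cfg.OUTPUT_DIR = ''

cfg.merge_from_file('configs/vit_base.yml')  # YAML values override defaults
cfg.freeze()
print(cfg.DATASETS.NAMES, cfg.OUTPUT_DIR)
```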

:car: Evaluation

For example, if you want to test the method on MARS, run

```bash
CUDA_VISIBLE_DEVICES=0 python eval-main.py
```
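Evaluation on MARS is conventionally reported as CMC (rank-k accuracy) and mAP over query-gallery distances. The sketch below shows that standard metric computation in generic NumPy; it is illustrative only, not the code inside eval-main.py, and all names are assumptions.

```python
# Standard ReID CMC/mAP computation (illustrative sketch, not eval-main.py).
# distmat: (num_query, num_gallery) distance matrix; pids/camids are 1-D arrays.
import numpy as np

def evaluate(distmat, q_pids, g_pids, q_camids, g_camids, max_rank=20):
    # Assumes the filtered gallery is larger than max_rank for every query.
    indices = np.argsort(distmat, axis=1)  # gallery sorted by ascending distance
    all_cmc, all_ap = [], []
    for i in range(distmat.shape[0]):
        order = indices[i]
        # Drop gallery samples with the same identity AND camera as the query.
        keep = ~((g_pids[order] == q_pids[i]) & (g_camids[order] == q_camids[i]))
        matches = (g_pids[order][keep] == q_pids[i]).astype(np.float32)
        if not matches.any():
            continue  # this query has no valid match in the gallery
        cmc = matches.cumsum()
        cmc[cmc > 1] = 1
        all_cmc.append(cmc[:max_rank])
        # Average precision for this query.
        precision_at_k = matches.cumsum() / (np.arange(len(matches)) + 1.0)
        all_ap.append((precision_at_k * matches).sum() / matches.sum())
    cmc = np.mean(np.stack(all_cmc), axis=0)        # CMC curve up to max_rank
    return cmc, float(np.mean(all_ap))              # (CMC, mAP)
```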

:hearts: Acknowledgment

This project is based on CLIP-ReID and XCLIP. Thanks to the authors for these excellent works.

:hearts: Contact

If you have any questions, please feel free to send an email to yuchenyang@mail.dlut.edu.cn or asuradayuci@gmail.com. ^_^

:book: Citation

If you find TF-CLIP useful, please consider citing :mega:

```bibtex
@inproceedings{tfclip,
  title     = {TF-CLIP: Learning Text-Free CLIP for Video-Based Person Re-identification},
  author    = {Chenyang Yu and Xuehu Liu and Yingquan Wang and Pingping Zhang and Huchuan Lu},
  booktitle = {AAAI},
  volume    = {38},
  number    = {7},
  pages     = {6764--6772},
  year      = {2024}
}
```

:book: LICENSE

TF-CLIP is released under the MIT License.