WentaoTan / MLLM4Text-ReID

Code for Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID (CVPR 2024)

Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID (CVPR 2024)

Requirements

pytorch 1.9.0
torchvision 0.10.0
prettytable
easydict
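
Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: it assumes Python 3.8+ (for `importlib.metadata`) and that PyTorch is installed under its pip distribution name `torch`. The names `REQUIRED`, `installed_version`, and `find_problems` are hypothetical helpers introduced here for illustration.

```python
from importlib import metadata

# Pins copied from the Requirements list above; None means "any version".
# Assumption: "pytorch" is installed under the pip distribution name "torch".
REQUIRED = {
    "torch": "1.9.0",
    "torchvision": "0.10.0",
    "prettytable": None,
    "easydict": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_problems(required, lookup=installed_version):
    """Compare the required pins against installed versions and list mismatches."""
    problems = []
    for package, wanted in required.items():
        found = lookup(package)
        if found is None:
            problems.append(f"{package}: not installed")
            continue
        # Drop local build tags such as "+cu111" before comparing.
        base = found.split("+")[0]
        if wanted is not None and base != wanted:
            problems.append(f"{package}: found {found}, README lists {wanted}")
    return problems


if __name__ == "__main__":
    for line in find_problems(REQUIRED) or ["environment matches the README"]:
        print(line)
```

Running the script prints one line per missing or mismatched package. Other versions may also work; the check only reports differences from the versions listed here.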

1. Construct LUPerson-MLLM

2. Prepare Downstream Datasets

Download the CUHK-PEDES dataset from here, the ICFG-PEDES dataset from here, and the RSTPReid dataset from here.

3. Pretrain Model (direct transfer setting)

To pretrain the model, run sh run.sh. When training completes, the script reports performance under the direct transfer setting.

4. Fine-tune the Pretrained Model on Downstream Datasets (fine-tune setting)

We release the pretrained model checkpoints here. To fine-tune the model, run sh finetune.sh --finetune checkpoint.pth, replacing checkpoint.pth with the path to the downloaded checkpoint. When training completes, the script reports performance under the fine-tune setting.
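
A mistyped checkpoint path is an easy way to lose time, so a small wrapper can check that the file exists before launching the fine-tuning command above. This is only a sketch under stated assumptions: `build_finetune_command` and `run_finetune` are hypothetical helpers, not part of the repository, and the wrapper assumes it is run from the directory that contains `finetune.sh`.

```python
import subprocess
from pathlib import Path


def build_finetune_command(checkpoint, script="finetune.sh"):
    """Build the argument list for the fine-tuning command shown above."""
    return ["sh", script, "--finetune", str(checkpoint)]


def run_finetune(checkpoint, script="finetune.sh"):
    """Check that the checkpoint exists, then launch the fine-tuning script."""
    path = Path(checkpoint)
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(build_finetune_command(path, script), check=True)
```

For example, `run_finetune("checkpoint.pth")` raises `FileNotFoundError` immediately if the file is missing, and otherwise runs the same command as typing it in a shell.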

Acknowledgments

This repo borrows partially from IRRA.

Citation

@article{tan2024harnessing,
  title={Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID},
  author={Wentao Tan and Changxing Ding and Jiayu Jiang and Fei Wang and Yibing Zhan and Dapeng Tao},
  journal={CVPR},
  year={2024},
}

Contact

Email: ftwentaotan@mail.scut.edu.cn or 731584671@qq.com

If possible, I would of course prefer to be contacted in Chinese!