CMG

The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)

Achieving Cross Modal Generalization with Multimodal Unified Representation, NeurIPS 2023

[Model overview figure]

This is the PyTorch implementation of our paper:

Achieving Cross Modal Generalization with Multimodal Unified Representation

Yan Xia, Hai Huang, Jieming Zhu, Zhou Zhao

In NeurIPS 2023


📝Requirements and Installation

git clone https://github.com/haihuangcode/CMG
cd CMG
# You do not need to install every library in requirements.txt; install packages as needed.
# Python 3.7 is recommended, since some of the libraries used do not support newer Python versions.
conda create -n your_env_name python=3.7
conda activate your_env_name
pip install -r requirements.txt
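
To confirm the environment is usable, here is a quick sanity check (assuming PyTorch is among the packages installed from requirements.txt):

```python
# Quick sanity check after installation (assumes torch was installed from requirements.txt).
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible
```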

🎓Cite

If you find this work useful, please consider citing it.

@article{xia2024achieving,
  title={Achieving Cross Modal Generalization with Multimodal Unified Representation},
  author={Xia, Yan and Huang, Hai and Zhu, Jieming and Zhao, Zhou},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

✏Model Checkpoints and Data Features

Baidu Disk (pwd: 1234)

✏Directory

CMG
├── checkpoint
├── cnt.pkl
├── code
├── data
├── figs
├── paper
├── README.md
└── requirements.txt
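
The archive from Baidu Disk is expected to provide the contents of checkpoint and data above. As a minimal, hypothetical sketch (the actual file name and key layout depend on the downloaded archive, and the model classes are defined under code), a pretrained checkpoint can be inspected before use:

```python
# Hypothetical sketch: inspect a checkpoint downloaded from Baidu Disk.
# "checkpoint/your_checkpoint.pt" is a placeholder, not the real file name.
import torch

ckpt = torch.load("checkpoint/your_checkpoint.pt", map_location="cpu")
if isinstance(ckpt, dict):
    # Most PyTorch checkpoints are dicts of tensors and/or metadata.
    for key in list(ckpt.keys())[:10]:
        print(key, type(ckpt[key]))
```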

✏Note

👍Acknowledgments

Our code is based on AVE, AVVP, PSP, CPSP, VGGSOUND, AVS.