
EMR-Merging

This is the official implementation of our NeurIPS 2024 Spotlight paper: EMR-Merging: Tuning-Free High-Performance Model Merging (arXiv). EMR-Merging achieves tuning-free, high-performance model merging.

We provide the code for merging ViT models, language models (including RoBERTa and GPT-2), and BEiT-3 models.

Method framework: In the (a) Merging Procedure, we merge the task-specific vectors into a single unified task vector together with lightweight task-specific modulators (masks and rescalers) that modulate its direction and amplitude. During the (b) Inference Procedure, we apply the corresponding mask and rescaler to the unified task vector to recover a task-specific vector. (c) Task-specific Direction and Amplitude Modulation covers how the task-specific masks and rescalers are obtained.
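For concreteness, below is a minimal NumPy sketch of the merging and inference steps described above, following the procedure in the paper: elect a unified sign per parameter, keep the largest agreeing magnitude, then derive per-task masks and rescalers. All function and variable names here are our own illustration, not this repository's API.

import numpy as np

def emr_merge(task_vectors):
    """Merging Procedure: merge task vectors tau_t (each theta_t - theta_pre)
    into a unified task vector plus per-task masks and rescalers."""
    taus = np.stack(task_vectors)                      # shape (T, D)
    # Elect a unified sign per parameter from the summed task vectors.
    gamma = np.sign(taus.sum(axis=0))
    # Unified amplitude: largest magnitude among entries whose sign
    # agrees with the elected sign.
    agree = np.sign(taus) == gamma
    eps = np.where(agree, np.abs(taus), 0.0).max(axis=0)
    tau_uni = gamma * eps
    # Task-specific masks: keep entries where tau_t agrees in sign
    # with the unified task vector.
    masks = taus * tau_uni > 0                         # shape (T, D), boolean
    # Task-specific rescalers: match the total magnitude of the masked
    # unified vector to that of the original task vector.
    rescalers = [
        np.abs(t).sum() / np.maximum(np.abs(m * tau_uni).sum(), 1e-12)
        for t, m in zip(taus, masks)
    ]
    return tau_uni, masks, rescalers

def recover(theta_pre, tau_uni, mask, rescaler):
    """Inference Procedure: apply the task-specific mask and rescaler
    to the unified task vector to recover a task-specific model."""
    return theta_pre + rescaler * mask * tau_uni

Treat this as a sketch under the assumption that parameters are flattened into 1-D vectors; the repository's actual code operates on full model state dicts per architecture.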

Citation

If you find this project helpful, feel free to cite our paper:

@article{huang2024emr,
  title={EMR-Merging: Tuning-Free High-Performance Model Merging},
  author={Huang, Chenyu and Ye, Peng and Chen, Tao and He, Tong and Yue, Xiangyu and Ouyang, Wanli},
  journal={arXiv preprint arXiv:2405.17461},
  year={2024}
}

Acknowledgement

Our implementation references the code of the repositories below; thanks to their authors.
