Open CCRss opened 1 week ago
Motivation

Is it possible to apply Mixed Preference Optimization (MPO) to the 76B InternVL model, similar to what is done for the 8B model?

Related resources

No response

Additional context

No response

Our current MPO codebase is built on HuggingFace's TRL library, which is not well suited to training very large models. We plan to enable training for the 76B model after migrating to a more efficient codebase.

We are also going to support Liger Kernel in the near future; perhaps once Liger Kernel reduces GPU memory usage, this codebase can be used to train MPO for the 76B model.
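For context, here is a minimal sketch of how Liger Kernel is commonly enabled on a HuggingFace Transformers causal LM to cut training memory. This is not the project's actual integration: the checkpoint path is a placeholder, and InternVL-specific patching may differ.

```python
# Minimal sketch of enabling Liger Kernel on a HuggingFace model to reduce
# GPU memory. The checkpoint path is a placeholder, not the InternVL/MPO setup.
import torch
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# AutoLigerKernelForCausalLM patches supported architectures with fused Triton
# kernels (e.g. RMSNorm, SwiGLU, fused linear cross-entropy) before loading
# weights, which mainly lowers activation memory during training.
model = AutoLigerKernelForCausalLM.from_pretrained(
    "path/to/base-llm",          # placeholder checkpoint
    torch_dtype=torch.bfloat16,
)
```

Recent versions of transformers also expose a `use_liger_kernel` flag on `TrainingArguments`, which applies the same patching automatically for supported architectures.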