Leeroo-AI / mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.
https://www.leeroo.com/
GNU Lesser General Public License v3.0

mergoo is a library for easily merging multiple LLM experts and efficiently training the merged LLM. With mergoo, you can integrate the knowledge of different generic or domain-specific LLM experts.

πŸš€ Features

If you like the project, consider leaving a ⭐️

Installation

Install with pip:

pip install mergoo

Install the latest (unstable) version from GitHub:

pip install git+https://github.com/Leeroo-AI/mergoo

Install from source:

git clone https://github.com/Leeroo-AI/mergoo
cd mergoo
pip install -e .

Quick Start

Configuration Setup

Specify the config for merging:

Fully Fine-tuned Experts

This is a sample config for merging fully fine-tuned LLM experts.

config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "experts": [
        {"expert_name": "base_expert", "model_id": "mistralai/Mistral-7B-v0.1"},
        {"expert_name": "expert_1", "model_id": "meta-math/MetaMath-Mistral-7B"},
        {"expert_name": "expert_2", "model_id": "ajibawa-2023/Code-Mistral-7B"}
    ],
    "router_layers": ["gate_proj", "up_proj", "down_proj"]
}

In the above example, we merged math and code Mistral-based experts: router_layers lists the layers that become expert (MoE) layers with a router on top, and num_experts_per_tok sets how many experts are activated per token. Please refer to this notebook for further details!

Mixture of Adapters (MoE on LoRA)

This is a sample config for merging LoRA fine-tuned LLM experts. mergoo builds a routing layer on top of the LoRAs, resulting in a mixture of adapters.

config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "base_model": "mistralai/Mistral-7B-v0.1",
    "experts": [
        {"expert_name": "adapter_1", "model_id": "predibase/customer_support"},
        {"expert_name": "adapter_2", "model_id": "predibase/customer_support_accounts"},
        {"expert_name": "adapter_3", "model_id": "predibase/customer_support_orders"},
        {"expert_name": "adapter_4", "model_id": "predibase/customer_support_payments"}
    ],
}

Note that each expert_name starts with adapter instead of expert. Please refer to this notebook for further details!

Merge Experts

Given the config above, mergoo creates the merged LLM as follows:

import torch
from mergoo.compose_experts import ComposeExperts

# compose the experts defined in the config and save the merged checkpoint
model_id = "data/mistral_lora_moe"
expertmerger = ComposeExperts(config, torch_dtype=torch.float16)
expertmerger.compose()
expertmerger.save_checkpoint(model_id)

Load / Finetune Merged Expert

Now, you can easily train the merged LLM with Hugging Face Trainer:

from transformers import Trainer
from mergoo.models.modeling_mistral import MistralForCausalLM

model = MistralForCausalLM.from_pretrained("data/mistral_lora_moe")
# NOTE: the 'gate' / router layers are untrained, so a weight-loading warning will appear for them

trainer = Trainer( ... )
trainer.train()
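
The Trainer call above is left schematic. Below is a minimal training sketch: the dataset (tatsu-lab/alpaca), tokenizer setup, hyperparameters, and the router-only freezing filter are illustrative assumptions rather than part of mergoo's API, so adapt them to your own data. The filter assumes the router submodules are named "gate", as the weight-loading note above suggests; inspect model.named_parameters() to confirm before relying on it.

# a minimal training sketch; dataset, hyperparameters, and the freezing filter are illustrative
import torch
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from mergoo.models.modeling_mistral import MistralForCausalLM

model = MistralForCausalLM.from_pretrained("data/mistral_lora_moe", torch_dtype=torch.bfloat16)

# optionally train only the router (gating) layers and freeze the expert weights;
# this assumes router parameters appear as ".gate." in their names (see the note above),
# so check model.named_parameters() to confirm before using this filter
for name, param in model.named_parameters():
    param.requires_grad_(".gate." in name)

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token

# placeholder dataset: replace with your own domain or instruction data
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1000]")
dataset = dataset.map(
    lambda sample: tokenizer(sample["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="checkpoints/mistral_lora_moe",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

Once trained, the checkpoint can be saved with trainer.save_model() and reloaded through the same MistralForCausalLM class.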

πŸ“š Learn More:

After finishing the Quick Start guide, you can explore the tutorials below to further familiarize yourself with mergoo.

| Notebook | Details |
|---|---|
| MoE with fully fine-tuned LLM experts | Build a unified Mixture-of-Experts model with fully fine-tuned experts. Inspired by BTX Research (Meta AI). |
| MoE with LoRA fine-tuned experts | Build a Mixture-of-Adapters expert. Inspired by xlora, Mixture-of-LoRAs, MoLE, PHATGOOSE, and MoELoRA. |
| Hugging Face Blog | Deep dive into the research details behind the merging methods of the mergoo library. |
| LLaMa3-based Experts | Build your own MoE-style LLM experts by integrating LLaMa3-based domain experts. |
| Phi3-based Experts | Create an MoE-style LLM architecture by merging Phi3-based fine-tuned models. |

Mergoo Roadmap and Contributing

As an open-source library in a fast-evolving domain, we welcome contributions, whether they introduce new features, enhance infrastructure, or improve documentation.

Here is the mergoo roadmap:

Feel free to suggest new features and/or contribute to the mergoo roadmap!

Join our community!

πŸš€ We love to here your feedback, please join Leeroo community:

Have a question not listed here? Open a GitHub Issue or send us an email!