arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

Existing Mergekit algorithms to merge VLM with LLM? #310

Open ChaseKolozsy opened 4 months ago

ChaseKolozsy commented 4 months ago

The following article discusses merging a Japanese VLM with an English LLM.

The article states that the authors did the following:

  1. Evolving the Weights for Mixing Parameters in the Parameter Space (PS): This step involves adjusting the weights of the parameters in the models to optimize performance.
  2. Evolving Layer Permutations in the Data Flow Space (DFS): This involves rearranging the layers of the models in a way that optimizes the flow of data and potentially enhances model performance.
  3. Integrated Strategy that Combines Both PS and DFS Merging: This final step merges the strategies from both the parameter space and the data flow space. It is not a simple copying or stitching of layers or parameters; it blends weights and configurations, akin to mixing colors (as illustrated by the transition from red and blue to purple in the diagram).
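
To make the three steps concrete, here is a toy sketch in plain Python. It is not mergekit's API and not the article's implementation: the models are tiny lists of floats, the function names (`merge_parameter_space`, `merge_data_flow_space`, `evolve_mix`) are hypothetical, and the search is a simple mutate-and-keep-best loop standing in for whatever evolutionary optimizer the article uses. It also assumes both models have identical layer shapes, which parameter-space blending requires.

```python
import random

# Toy "models": a list of layers, each layer a list of float weights.
# Assumption: both models share the same architecture and shapes.
model_a = [[1.0, 2.0], [3.0, 4.0]]
model_b = [[5.0, 6.0], [7.0, 8.0]]


def merge_parameter_space(a, b, mix):
    """PS merge: layer i becomes mix[i] * a[i] + (1 - mix[i]) * b[i]."""
    return [
        [m * wa + (1.0 - m) * wb for wa, wb in zip(layer_a, layer_b)]
        for layer_a, layer_b, m in zip(a, b, mix)
    ]


def merge_data_flow_space(models, path):
    """DFS merge: build a layer stack from (model_index, layer_index) picks."""
    return [models[mi][li] for mi, li in path]


def evolve_mix(a, b, fitness, generations=50, population=8, sigma=0.1, seed=0):
    """Search per-layer mixing weights by mutating the best candidate so far."""
    rng = random.Random(seed)
    best = [0.5] * len(a)
    best_score = fitness(merge_parameter_space(a, b, best))
    for _ in range(generations):
        for _ in range(population):
            # Gaussian mutation, clipped so each weight stays in [0, 1].
            cand = [min(1.0, max(0.0, m + rng.gauss(0.0, sigma))) for m in best]
            score = fitness(merge_parameter_space(a, b, cand))
            if score > best_score:
                best, best_score = cand, score
    return best, best_score


# Step 1 (PS): an even blend of every layer.
blended = merge_parameter_space(model_a, model_b, [0.5, 0.5])
# blended == [[3.0, 4.0], [5.0, 6.0]]

# Step 2 (DFS): route data through a0 -> b0 -> a1.
stacked = merge_data_flow_space([model_a, model_b], [(0, 0), (1, 0), (0, 1)])

# Step 3 (combined): blend first, then reorder layers of the blended model.
combined = merge_data_flow_space([blended, model_b], [(0, 0), (1, 1), (0, 1)])
```

In a real setting the `fitness` argument would be a benchmark score for the merged model; here any function that maps a merged layer list to a number will do, for example the negative squared distance to a target set of weights.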

Screenshot from 2024-05-08 11-35-15

Evolutionary Optimization of Model Merging Recipes

Does Mergekit have similar algorithms that could merge, say, BAAI/Bunny-Llama-3-8B-V with meta-llama/Meta-Llama-3-70B-Instruct?

choprahetarth commented 3 months ago

I think evolutionary merging is included in mergekit now.