arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

[question] `task_arithmetic` simple question #438

Closed by eunbin079 1 month ago

eunbin079 commented 1 month ago

Hello! If I want to compute a chat vector (WizardLM/WizardLM-13B-V1.2 - garage-bAInd/Platypus2-13B) and apply it to a base model (psmathur/orca_mini_v3_13b), how should I write the config?

I think I can use task_arithmetic, but I don't know how to write the config file. Please help!

cg123 commented 1 month ago

You can do this:

merge_method: task_arithmetic
base_model: garage-bAInd/Platypus2-13B
models:
  - model: psmathur/orca_mini_v3_13b
    parameters:
      weight: 1.0
  - model: WizardLM/WizardLM-13B-V1.2
    parameters:
      weight: 0.3 # alpha

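That will give you:

garage-bAInd/Platypus2-13B + (psmathur/orca_mini_v3_13b - garage-bAInd/Platypus2-13B) * 1 + (WizardLM/WizardLM-13B-V1.2 - garage-bAInd/Platypus2-13B) * alpha

With a weight of 1 on psmathur/orca_mini_v3_13b, the two garage-bAInd/Platypus2-13B terms at the front cancel, so this is equal to:

psmathur/orca_mini_v3_13b + (WizardLM/WizardLM-13B-V1.2 - garage-bAInd/Platypus2-13B) * alpha

Here alpha is the weight on WizardLM/WizardLM-13B-V1.2 (0.3 in the config above).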
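To check that algebra numerically, here is a minimal Python sketch on toy vectors. It is not mergekit's actual implementation: the `task_arithmetic` helper is hypothetical, the three short lists are made-up stand-ins for real model weights, and it ignores any extra merge options. It only shows the basic rule of adding the base to the weighted sum of each model's difference from the base.

```python
def task_arithmetic(base, models):
    """Toy task arithmetic: base + sum(weight * (model - base)).

    `base` is a list of floats; `models` is a list of (tensor, weight) pairs.
    Hypothetical helper for illustration only, not a mergekit API.
    """
    merged = list(base)
    for tensor, weight in models:
        for i, (m, b) in enumerate(zip(tensor, base)):
            merged[i] += weight * (m - b)
    return merged


# Made-up values standing in for one tensor of each model.
platypus = [0.10, -0.20, 0.30]   # base_model in the config
orca = [0.15, -0.10, 0.25]       # weight 1.0
wizard = [0.05, -0.30, 0.40]     # weight alpha
alpha = 0.3

merged = task_arithmetic(platypus, [(orca, 1.0), (wizard, alpha)])

# The simplified form: orca + alpha * (wizard - platypus).
expected = [o + alpha * (w - p) for o, w, p in zip(orca, wizard, platypus)]

# Both forms agree up to floating-point rounding.
assert all(abs(m - e) < 1e-9 for m, e in zip(merged, expected))
```

Because the model you want to end up "on top of" carries weight 1.0, it replaces the base in the result, and only the chat vector is scaled by alpha.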

eunbin079 commented 1 month ago

@cg123 You are a genius. Thank you so much! 🥹