arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

How to merge only at q_proj layers with SLERP? #327

Closed yiyiwwang closed 4 months ago

yiyiwwang commented 4 months ago

Could you please tell me how to merge two models with SLERP at only the q_proj layers, while keeping all other layers the same as the base model? Is the following config correct? Thanks very much.

models:
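The config above is cut off in this capture, so it is not reconstructed here. To illustrate the intent only (SLERP on q_proj tensors, everything else left as the base model), here is a minimal sketch in plain Python. It is not mergekit code: `slerp`, `t_for_tensor`, and `merge_q_proj_only` are hypothetical helpers, the tensor names are made up, and the substring match on `"q_proj"` is an assumption about how a name filter would select tensors. The point it shows is that an interpolation factor of t = 0 returns the base tensor unchanged.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat vectors (lists of floats)."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Vectors are (nearly) colinear: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def t_for_tensor(name, q_proj_t=0.5):
    """Hypothetical filter: interpolate q_proj tensors, keep the base elsewhere."""
    return q_proj_t if "q_proj" in name else 0.0


def merge_q_proj_only(base, other, q_proj_t=0.5):
    """Merge two name -> vector dicts, applying SLERP with a per-tensor t."""
    merged = {}
    for name, weights in base.items():
        merged[name] = slerp(t_for_tensor(name, q_proj_t), weights, other[name])
    return merged


# Toy tensors with made-up names, only to show the behaviour.
base = {
    "layers.0.self_attn.q_proj.weight": [1.0, 0.0],
    "layers.0.mlp.up_proj.weight": [1.0, 0.0],
}
other = {
    "layers.0.self_attn.q_proj.weight": [0.0, 1.0],
    "layers.0.mlp.up_proj.weight": [0.0, 1.0],
}
merged = merge_q_proj_only(base, other)
```

In this toy run the q_proj tensor ends up halfway along the arc between the two models, while the mlp tensor stays equal to the base model's, because its t is 0.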

yiyiwwang commented 4 months ago

Another question: what does the parameter t mean when it is an array like [0, 0.5, 0.3, 0.7, 1] rather than a single value? Is there any guidance on choosing a suitable t?
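As I understand it, a list of numbers for t is treated as a gradient: the values are anchor points spread evenly over the depth of the model, and each layer gets a t linearly interpolated between the two nearest anchors. The sketch below illustrates that reading in plain Python. `expand_gradient` is a hypothetical helper, not a mergekit function, and the exact rounding and edge handling in mergekit may differ.

```python
def expand_gradient(gradient, num_layers):
    """Expand a list of anchor values into one value per layer.

    Assumes anchors are evenly spaced from the first layer to the last and
    that values in between are linearly interpolated.
    """
    if num_layers == 1:
        return [float(gradient[0])]
    per_layer = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, in [0, len(gradient) - 1].
        pos = layer / (num_layers - 1) * (len(gradient) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(gradient) - 1)
        frac = pos - lo
        per_layer.append((1 - frac) * gradient[lo] + frac * gradient[hi])
    return per_layer


# With 9 layers and 5 anchors, every second layer lands exactly on an anchor.
t_per_layer = expand_gradient([0, 0.5, 0.3, 0.7, 1], 9)
```

Under this reading, [0, 0.5, 0.3, 0.7, 1] means the first layer is taken entirely from the base model (t = 0), the last layer entirely from the other model (t = 1), and the layers in between follow the interpolated curve.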

yiyiwwang commented 4 months ago

I think the config in my first question is probably fine. The second question is answered in https://github.com/arcee-ai/mergekit/issues/5.