Open authurlord opened 9 months ago
@authurlord I'm not the author of this repo and just happened to come across it. For LoRA weight merging, lorahub is a candidate method (https://github.com/sail-sg/lorahub).
+1 for this feature request. It would open up some interesting merge possibilities if we could leverage mergekit's advanced methods for merging LoRA adapters.
Hello all, I have some questions regarding LoRA and task vectors. Typically, the task vector is defined as:
$$ \theta_{\text{sft}} - \theta_{\text{pt}} $$
where $\theta_{\text{sft}}$ is the fine-tuned model and $\theta_{\text{pt}}$ is the pre-trained model.
For LoRA, the formulation is:
$$ H_{\text{hidden}} = X_{\text{input}} \cdot W_{\text{frozen}} + X_{\text{input}} \cdot W_a \cdot W_b $$
where $W_a \cdot W_b$ represents the low-rank adaptation weights.
My question is: isn't the LoRA update, $W_a \cdot W_b$, naturally equivalent to a task vector? If so, the methods typically used for task-vector merging (such as TIES-Merging) should be directly applicable to LoRA weights as well.
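To make the idea concrete, here is a minimal numpy sketch (not mergekit's or TIES-Merging's actual implementation) that treats each adapter's dense delta $W_a \cdot W_b$ as a task vector and applies the TIES-style steps: trim small-magnitude entries, elect a majority sign per parameter, then average the agreeing values. The `density` parameter and shapes are illustrative assumptions.

```python
import numpy as np

def lora_delta(W_a, W_b):
    """Task-vector view of a LoRA adapter: delta = W_a @ W_b."""
    return W_a @ W_b

def ties_merge(deltas, density=0.5):
    """Simplified TIES-style merge of same-shaped task vectors."""
    trimmed = []
    for d in deltas:
        # trim: keep only the top-k entries by magnitude
        k = int(np.ceil(density * d.size))
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack(trimmed)
    # elect: dominant sign per parameter across adapters
    sign = np.sign(stacked.sum(axis=0))
    # merge: average only the surviving entries that agree with the sign
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    count = np.maximum(agree.sum(axis=0), 1)
    return (stacked * agree).sum(axis=0) / count

rng = np.random.default_rng(0)
# three hypothetical rank-2 adapters on an 8x8 weight matrix
deltas = [lora_delta(rng.normal(size=(8, 2)), rng.normal(size=(2, 8)))
          for _ in range(3)]
merged = ties_merge(deltas, density=0.3)
print(merged.shape)  # (8, 8)
```

One caveat: the merged delta is generally no longer rank-$r$, so writing it back out as a single LoRA adapter would need an extra low-rank factorization (e.g. a truncated SVD).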
Please correct me if my understanding is wrong, thank you guys!
Many thanks for the great work. Can mergekit merge multiple LoRA checkpoints trained on the same base model, with user-provided weights, and output a single merged LoRA adapter? PEFT has an add_weighted_adapter method (https://huggingface.co/docs/peft/v0.7.1/en/package_reference/lora#peft.LoraModel.add_weighted_adapter) for merging multiple LoRAs, but it only supports the 'linear/cat/svd' combination types.
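For reference, the 'linear' combination type amounts to a weighted sum of the adapter deltas. A rough numpy sketch of that idea (not PEFT's actual code, which keeps the factored low-rank form rather than materializing the dense delta):

```python
import numpy as np

def add_weighted_lora(adapters, weights):
    """Weighted-sum merge of LoRA deltas: sum_i w_i * (W_a_i @ W_b_i).

    adapters: list of (W_a, W_b) pairs trained on the same base model.
    Returns the dense merged delta for illustration.
    """
    return sum(w * (W_a @ W_b) for w, (W_a, W_b) in zip(weights, adapters))

rng = np.random.default_rng(1)
# two hypothetical rank-4 adapters on a 16x16 weight matrix
ads = [(rng.normal(size=(16, 4)), rng.normal(size=(4, 16))) for _ in range(2)]
delta = add_weighted_lora(ads, [0.7, 0.3])
print(delta.shape)  # (16, 16)
```

The appeal of routing this through mergekit instead would be access to sign-aware methods like TIES or DARE rather than only a plain weighted sum.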
Besides, for the mixtral_merge method, is the gating network constructed by checking the positive and negative prompts provided for each expert?