Closed ljleb closed 11 months ago
Thanks for the information!
Although it's ultimately up to you, I disagree with closing this issue. I think there are other useful features in sd-meh, for example weights clipping.
Something you may want to consider is using the library directly. If the extension uses the library instead of re-implementing everything, you would only have to bump the version of sd-meh when new useful merge techniques are found. But again, up to you.
model-mixer merges at the block level, and internally even at the key level, which explains its high speed. sd-meh does not support block-level merging at all, which is why model-mixer can't use it without modification.
Please tell me if I'm wrong.
What does clip_weights do? It's a very simple algorithm to reduce over-fitted results: https://github.com/s1dlx/meh/blob/main/sd_meh/merge.py#L440C1-L445C76
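    import torch

    def clip_weights_key(thetas, merged_weights, key):
        # Element-wise bounds taken from the two source models for this key.
        t0 = thetas["model_a"][key]
        t1 = thetas["model_b"][key]
        maximums = torch.maximum(t0, t1)
        minimums = torch.minimum(t0, t1)
        # Clamp every merged weight into [minimums, maximums].
        return torch.minimum(torch.maximum(merged_weights, minimums), maximums)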
As you can see, this procedure operates at the key level, so it could be applied to model-mixer easily.
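To illustrate what "key level" means here, below is a minimal, dependency-free sketch of the same clamp applied independently to each key. Plain Python lists stand in for torch tensors, and the helper names (`clip_key`, `clip_all_keys`) and the key names are hypothetical; none of them come from sd-meh or model-mixer.

```python
# Hypothetical helpers: plain lists stand in for torch tensors.
def clip_key(a, b, merged):
    """Clamp each merged value into [min(a, b), max(a, b)], element-wise."""
    return [min(max(m, min(x, y)), max(x, y)) for x, y, m in zip(a, b, merged)]


def clip_all_keys(model_a, model_b, merged):
    """Apply the clamp to every key of the merged dict, one key at a time."""
    return {
        key: clip_key(model_a[key], model_b[key], values)
        for key, values in merged.items()
    }


# Made-up key names and values, for illustration only.
model_a = {"block.0.weight": [0.0, 1.0], "block.1.weight": [2.0, -1.0]}
model_b = {"block.0.weight": [0.5, 0.5], "block.1.weight": [1.0, -2.0]}
merged = {"block.0.weight": [0.75, 0.75], "block.1.weight": [1.5, 0.0]}

clipped = clip_all_keys(model_a, model_b, merged)
print(clipped)
```

Because each key is processed on its own, the clamp needs nothing beyond the two source tensors and the merged tensor for that key, which is why it should fit a merger that already iterates over keys.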
https://github.com/s1dlx/meh is a small library that has a good number of merge methods, some of which are not in supermerger. One very important feature is "weights clipping", which makes it possible to merge models using add difference at alpha=1.0 with limited distortion, by clipping the merged weights to the range spanned by the original models A and B. There's also rebasin, which makes it possible to reduce the loss when merging using weighted sum.
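A small worked example may help show why clipping limits distortion. It assumes the commonly used add difference formula `A + alpha * (B - C)`, which is an assumption and is not taken from sd-meh's source. The function names are hypothetical and scalars stand in for individual weights.

```python
def add_difference(a, b, c, alpha):
    """Assumed add difference formula: A + alpha * (B - C)."""
    return a + alpha * (b - c)


def clip(value, a, b):
    """Clamp value into the range spanned by the two source weights."""
    return min(max(value, min(a, b)), max(a, b))


# One weight from each of models A, B and C (made-up values).
a, b, c = 0.25, 0.5, 0.125

raw = add_difference(a, b, c, alpha=1.0)  # overshoots both A and B
clipped = clip(raw, a, b)                 # pulled back to the nearer bound

print(raw, clipped)
```

Here the raw result lands outside the interval between the A and B weights, and clipping moves it back to the nearest bound. A result that already lies between A and B is left unchanged.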
Note that the library does not yet support SDXL in the main branch.
I did contribute to it a little bit, which is why I know about this library. Just wanted to mention this in case you were considering adding more merge options.