54rt1n / ComfyUI-DareMerge

ComfyUI powertools for SD1.5 and SDXL model merging
GNU General Public License v3.0

median & quantile #3

Closed. novaexe closed this issue 9 months ago.

novaexe commented 9 months ago

Can median & quantile no longer be performed on the sum of 2 models and then loaded?

novaexe commented 9 months ago

In addition, I'm also very confused by the masked approach in general; does it now require a base model to be created first? Some example workflows would be fantastic.

54rt1n commented 9 months ago

The masking used to be hidden under the surface, and didn't work very well. The purpose is to be able to find 'significant' parameters by comparing the model to a base model. For DARE-TIES merging, the mask is optional, but it can be used to protect some parameters from being overwritten. All of the options that were moved to masking were only related to that operation.
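
Conceptually, the mask is built something like this (a rough PyTorch sketch of the idea only; the function and argument names here are made up, not the actual node code):

```python
import torch

def significance_mask(model_sd, base_sd, quantile=0.9):
    """For each tensor, mark the parameters whose change from the base model
    is in the top (1 - quantile) fraction by magnitude."""
    mask = {}
    for key, w in model_sd.items():
        delta = (w - base_sd[key]).abs()
        threshold = torch.quantile(delta.flatten().float(), quantile)
        mask[key] = delta >= threshold  # True = 'significant' relative to the base
    return mask
```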

novaexe commented 9 months ago

Hmm 🤔 so there is now no longer a way to choose? Before, it was quite clear: [image]

54rt1n commented 9 months ago

https://github.com/54rt1n/ComfyUI-DareMerge/blob/master/examples/daremerge.png Does this help?

novaexe commented 9 months ago

Mmmm, kinda, but I still don't get it :(

54rt1n commented 9 months ago

I added another picture. You can see how the original model's image changes a lot when you do a normal DARE merge, but when you mask the original model it retains a lot more of its original character, because the mask protected the 'most trained' parameters in the original model by comparing it to the base model (SD 1.5).

https://github.com/54rt1n/ComfyUI-DareMerge/blob/master/examples/daremergepic.png

novaexe commented 9 months ago

Yeah, I suppose I need to play around with it more. I still can't do a quantile merge between just 2 models though?

novaexe commented 9 months ago

Or is that what the masked model is now for? So it requires a base?

54rt1n commented 9 months ago

I pushed an update and changed the wording around some. If you wanted to do just a quantile merge without the DARE-TIES sampling, you could create a mask from the two models you want to merge (so you can find all the changes above/below your quantile), and then just do a regular Masked Model merge. I think it makes more sense the more you play with it.
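
In pseudo-code, that two-step recipe looks roughly like this (again just a sketch of the idea with invented names, not the actual nodes):

```python
import torch

def quantile_mask(a_sd, b_sd, quantile=0.9, mode="above"):
    """Mask built from the two models being merged: per tensor, select the
    parameters whose difference |A - B| lies above (or below) the quantile."""
    mask = {}
    for key, a in a_sd.items():
        diff = (a - b_sd[key]).abs()
        t = torch.quantile(diff.flatten().float(), quantile)
        mask[key] = diff >= t if mode == "above" else diff < t
    return mask

def masked_merge(a_sd, b_sd, mask, ratio=0.5):
    """Plain weighted merge of B into A, applied only where the mask is True."""
    out = {}
    for key, a in a_sd.items():
        blended = a * (1.0 - ratio) + b_sd[key] * ratio
        out[key] = torch.where(mask[key], blended, a)
    return out
```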

novaexe commented 9 months ago

Thanks very much for your time and patience. Mostly I'm just trying to wrap my head around where DARE merge fits in. I initially thought it would be a drop-in replacement of sorts for add difference, but it turns out it's way more complicated than that. Before, I could merge 2 models with completely different deltas and achieve something similar; it looks like that is no longer the case, or my brain is having trouble understanding the masked sum of 2 parts applied to a target, instead of the way it was before, with C being the protected model and A/B being a chosen difference of how much more of either you would want.

novaexe commented 9 months ago

> Hmm 🤔 so there is now no longer a way to choose? Before, it was quite clear: [image]

As again, this was significantly easier to understand than being forced into a repeat merge, even if it was being done in the backend for you.

novaexe commented 9 months ago

[image] This is now what makes the most sense to me, but by your example that is incorrect.

novaexe commented 9 months ago

[image]

54rt1n commented 9 months ago

The mask selects all of the parameters you want to include in the merge. Since we don't want to overwrite the biggest ones in the model we are merging into, we mask our target model with 'below' at ~90%. That way the parameters we want to keep are excluded from the merge. At least, that's a good place to start.
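
A toy example of what 'below' at ~0.90 means (made-up numbers, just to show which side of the threshold participates):

```python
import torch

deltas = torch.randn(1000).abs()           # stand-in for per-parameter |model - base|
threshold = torch.quantile(deltas, 0.90)   # 90th percentile
include = deltas < threshold               # 'below': these ~90% take part in the merge
protect = ~include                         # the biggest ~10% stay untouched
print(include.float().mean().item())       # ~0.90
```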

54rt1n commented 9 months ago

The math behind DARE-TIES is that you don't need to take all of the parameters in a merge; if you only take some percentage of the likely best ones, the resulting model can have the full capabilities of both, instead of just capabilities 'in the middle' between the two models. You cycle your seed until you find one that captures the part of the second model you were wanting to add.
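
The drop-and-rescale part looks roughly like this (a simplified sketch that ignores the TIES sign-election step and treats the model you're merging into as the reference; the names are invented):

```python
import torch

def dare_merge(a_sd, b_sd, drop_p=0.9, seed=0):
    """Randomly drop most of B's delta and rescale the survivors, so on average
    the merge still moves as far as the full delta would."""
    gen = torch.Generator().manual_seed(seed)  # cycling the seed changes which parameters survive
    out = {}
    for key, a in a_sd.items():
        delta = b_sd[key] - a
        keep = (torch.rand(delta.shape, generator=gen) >= drop_p).to(delta.dtype)
        out[key] = a + keep * delta / (1.0 - drop_p)
    return out
```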

novaexe commented 9 months ago

So in your example that is taking the knowledge of sakura with a 90% drop and injecting it into dreamshaper; that part I get. What I don't get is why you protect dreamshaper instead of sakura.

54rt1n commented 9 months ago

It drops 90% of sakura's parameters, and it protects dreamshaper because I don't want to overwrite the parameters that are highly significant to the original model's output. By setting my layer weights to zero, it takes 100% of the value of the second model for the small number of parameters that are copied, which can produce some amazing effects.
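
A tiny numeric example of what a 0.0 layer weight for the original model does to the parameters that survive the drop (invented numbers):

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0, 4.0])                # original model (dreamshaper)
b = torch.tensor([9.0, 8.0, 7.0, 6.0])                # incoming model (sakura)
survives = torch.tensor([False, True, False, False])  # the few undropped, unprotected spots

# layer weight 0.0 for the original: the surviving spot is copied outright, not blended
merged = torch.where(survives, b, a)
print(merged)  # tensor([1., 8., 3., 4.])
```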

novaexe commented 9 months ago

Oooooo okay, I think I understand now! Right, I was kinda on the right track, I just got confused by the reverse order compared to an add-difference merge, A + (B - C).

novaexe commented 9 months ago

In this way I guess it would be A + protect(C) + a 90% drop of B.

novaexe commented 9 months ago

Or something like that. Thanks very much again! xD Very interesting stuff, and just a little complicated, but I'll take the time to learn it :D