Closed SwiftIllusion closed 10 months ago
Thanks for the suggestion, I've been interested in lossless merging, but the complexity of the code has kept me from getting into it. I will think about what you suggested.
As for the Save button, it used to exist in the past, but has been removed. This is because merging is done again even if the Save button is pressed. The loaded model is in fp16 format and needs to be merged back.
No worries, I'm especially glad then I was able to share here how I ended up implementing it and how another method could be added too :) . Good luck whenever you might get to it.
Ahh if that's the case it makes sense, with a button you would expect it to be able to save immediately but if there's that limitation and it needs to merge again anyway that button would be confusing, thanks for clarifying.
Added features. Thanks!
You can actually get a lot of the performance back when using the filters by offloading it to the gpu by using CuPy. It shouldn't be too difficult to implement.
Very smooth implementation, thank you for the great work :)
However, I found an error in the latest update when trying to save a file specifically as a safetensors file (with both normal calculation and cosine):
Traceback (most recent call last):
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 833, in call_function
prediction = await anyio.to_thread.run_sync(
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\extensions\sd-webui-supermerger\scripts\mergers\mergers.py", line 72, in smergegen
result = savemodel(theta_0,currentmodel,custom_name,save_sets,model_a,metadata) if save else "Merged model loaded:"+currentmodel
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\extensions\sd-webui-supermerger\scripts\mergers\model_util.py", line 700, in savemodel
safetensors.torch.save_file(state_dict, fname, metadata=metadata)
File "D:\AIdev\AIdiffusion\diffusion\stable-diffusion-webui\venv\lib\site-packages\safetensors\torch.py", line 71, in save_file
serialize_file(_flatten(tensors), filename, metadata=metadata)
TypeError: argument 'metadata': 'dict' object cannot be converted to 'PyString'
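The `TypeError` above is consistent with a metadata value that is not a plain string: `safetensors.torch.save_file` expects `metadata` to map strings to strings. Below is a minimal sketch of one possible workaround, assuming the nested values are JSON-serializable. `stringify_metadata` and the `merge_settings` key are hypothetical names used for illustration, not part of supermerger or safetensors.

```python
import json


def stringify_metadata(metadata):
    """Return a copy of metadata in which every key and value is a str.

    Hypothetical helper: non-string values are JSON-encoded so the result
    is a flat str -> str mapping.
    """
    flat = {}
    for key, value in (metadata or {}).items():
        flat[str(key)] = value if isinstance(value, str) else json.dumps(value)
    return flat


# Illustrative input: one plain string and one nested dict.
example = {"format": "pt", "merge_settings": {"alpha": 0.5, "mode": "cosineA"}}
flat = stringify_metadata(example)
```

The flattened dict could then be passed as `metadata=flat`, and the nested settings recovered later with `json.loads`.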
Also, thanks to how smooth this latest implementation is, I was able to add another version of the Cosine merge, which mixes the weights separately before calculating cosine. The comparison demonstrated below shows how this can favor the structures of A and the details of B, or the other way round, between modelA and modelB (it is calculated in a different sequence, so it's not a result you would get by just swapping the A/B models around). I added and adjusted them as the cosineA and cosineB calculation modes.
I've also added smoothAdd as a calculation mode; this is the smoother, filtered add difference method from the original post that was missed here. Thanks to the implementation of the first Cosine, I was able to implement them 'properly' this time instead of just replacing existing things.
In supermerger.py I replace
calcmode = gr.Radio(label = "Calcutation Mode",choices = ["normal", "cosine"], value = "normal")
with
calcmode = gr.Radio(label = "Calcutation Mode",choices = ["normal", "cosineA", "cosineB", "smoothAdd"], value = "normal")
Then in mergers.py I replace
elif calcmode == "cosine":
    # skip VAE model parameters to get better results
    if "first_stage_model" in key: continue
    if "model" in key and key in theta_0:
        simab = sim(theta_0[key].to(torch.float32), theta_1[key].to(torch.float32))
        dot_product = torch.dot(theta_0[key].view(-1).to(torch.float32), theta_1[key].view(-1).to(torch.float32))
        magnitude_similarity = dot_product / (torch.norm(theta_0[key].to(torch.float32)) * torch.norm(theta_1[key].to(torch.float32)))
        combined_similarity = (simab + magnitude_similarity) / 2.0
        k = (combined_similarity - sims.min()) / (sims.max() - sims.min())
        k = k - current_alpha
        k = k.clip(min=.0,max=1.)
        caster(f"model A[{key}] + {1-k} + * (model B)[{key}]*{k}",hear)
        theta_0[key] = theta_0[key] * (1 - k) + theta_1[key] * k
with
elif calcmode == "cosineA": #favors modelA's structure with details from B
    # skip VAE model parameters to get better results
    if "first_stage_model" in key: continue
    if "model" in key and key in theta_0:
        # Normalize the vectors before merging
        theta_0_norm = nn.functional.normalize(theta_0[key].to(torch.float32), p=2, dim=0)
        theta_1_norm = nn.functional.normalize(theta_1[key].to(torch.float32), p=2, dim=0)
        simab = sim(theta_0_norm, theta_1_norm)
        dot_product = torch.dot(theta_0_norm.view(-1), theta_1_norm.view(-1))
        magnitude_similarity = dot_product / (torch.norm(theta_0_norm) * torch.norm(theta_1_norm))
        combined_similarity = (simab + magnitude_similarity) / 2.0
        k = (combined_similarity - sims.min()) / (sims.max() - sims.min())
        k = k - current_alpha
        k = k.clip(min=.0,max=1.)
        caster(f"model A[{key}] + {1-k} + * (model B)[{key}]*{k}",hear)
        theta_0[key] = theta_1[key] * (1 - k) + theta_0[key] * k
elif calcmode == "cosineB": #favors modelB's structure with details from A
    # skip VAE model parameters to get better results
    if "first_stage_model" in key: continue
    if "model" in key and key in theta_0:
        simab = sim(theta_0[key].to(torch.float32), theta_1[key].to(torch.float32))
        dot_product = torch.dot(theta_0[key].view(-1).to(torch.float32), theta_1[key].view(-1).to(torch.float32))
        magnitude_similarity = dot_product / (torch.norm(theta_0[key].to(torch.float32)) * torch.norm(theta_1[key].to(torch.float32)))
        combined_similarity = (simab + magnitude_similarity) / 2.0
        k = (combined_similarity - sims.min()) / (sims.max() - sims.min())
        k = k - current_alpha
        k = k.clip(min=.0,max=1.)
        caster(f"model A[{key}] + {1-k} + * (model B)[{key}]*{k}",hear)
        theta_0[key] = theta_1[key] * (1 - k) + theta_0[key] * k
elif calcmode == "smoothAdd":
    caster(f"model A[{key}] + {current_alpha} + * (model B - model C)[{key}]", hear)
    # Apply median filter to the weight differences
    filtered_diff = scipy.ndimage.median_filter(theta_1[key].to(torch.float32).cpu().numpy(), size=3)
    # Apply Gaussian filter to the filtered differences
    filtered_diff = scipy.ndimage.gaussian_filter(filtered_diff, sigma=1)
    theta_1[key] = torch.tensor(filtered_diff)
    # Add the filtered differences to the original weights
    theta_0[key] = theta_0[key] + current_alpha * theta_1[key]
and
if calcmode =="cosine":
    if stopmerge: return "STOPPED", *non4
    sim = torch.nn.CosineSimilarity(dim=0)
    sims = np.array([], dtype=np.float64)
    for key in (tqdm(theta_0.keys(), desc="Stage 0/2")):
        # skip VAE model parameters to get better results
        if "first_stage_model" in key: continue
        if "model" in key and key in theta_1:
            simab = sim(theta_0[key].to(torch.float32), theta_1[key].to(torch.float32))
            dot_product = torch.dot(theta_0[key].view(-1).to(torch.float32), theta_1[key].view(-1).to(torch.float32))
            magnitude_similarity = dot_product / (torch.norm(theta_0[key].to(torch.float32)) * torch.norm(theta_1[key].to(torch.float32)))
            combined_similarity = (simab + magnitude_similarity) / 2.0
            sims = np.append(sims, combined_similarity.numpy())
    sims = sims[~np.isnan(sims)]
    sims = np.delete(sims, np.where(sims < np.percentile(sims, 1, method='midpoint')))
    sims = np.delete(sims, np.where(sims > np.percentile(sims, 99, method='midpoint')))
with
if calcmode =="cosineA": #favors modelA's structure with details from B
    if stopmerge: return "STOPPED", *non4
    sim = torch.nn.CosineSimilarity(dim=0)
    sims = np.array([], dtype=np.float64)
    for key in (tqdm(theta_0.keys(), desc="Stage 0/2")):
        # skip VAE model parameters to get better results
        if "first_stage_model" in key: continue
        if "model" in key and key in theta_1:
            theta_0_norm = nn.functional.normalize(theta_0[key].to(torch.float32), p=2, dim=0)
            theta_1_norm = nn.functional.normalize(theta_1[key].to(torch.float32), p=2, dim=0)
            simab = sim(theta_0_norm, theta_1_norm)
            sims = np.append(sims,simab.numpy())
    sims = sims[~np.isnan(sims)]
    sims = np.delete(sims, np.where(sims<np.percentile(sims, 1 ,method = 'midpoint')))
    sims = np.delete(sims, np.where(sims>np.percentile(sims, 99 ,method = 'midpoint')))
if calcmode =="cosineB": #favors modelB's structure with details from A
    if stopmerge: return "STOPPED", *non4
    sim = torch.nn.CosineSimilarity(dim=0)
    sims = np.array([], dtype=np.float64)
    for key in (tqdm(theta_0.keys(), desc="Stage 0/2")):
        # skip VAE model parameters to get better results
        if "first_stage_model" in key: continue
        if "model" in key and key in theta_1:
            simab = sim(theta_0[key].to(torch.float32), theta_1[key].to(torch.float32))
            dot_product = torch.dot(theta_0[key].view(-1).to(torch.float32), theta_1[key].view(-1).to(torch.float32))
            magnitude_similarity = dot_product / (torch.norm(theta_0[key].to(torch.float32)) * torch.norm(theta_1[key].to(torch.float32)))
            combined_similarity = (simab + magnitude_similarity) / 2.0
            sims = np.append(sims, combined_similarity.numpy())
    sims = sims[~np.isnan(sims)]
    sims = np.delete(sims, np.where(sims < np.percentile(sims, 1, method='midpoint')))
    sims = np.delete(sims, np.where(sims > np.percentile(sims, 99, method='midpoint')))
and I added these necessary imports
import torch.nn as nn
import scipy.ndimage
from scipy.ndimage.filters import median_filter as filter
@mariaWitch Thanks for the suggestion. I believe I found the code for that method, but it looks to be restricted to CUDA devices. When I tried to pip install it, it wouldn't work (it couldn't find CUDA for some reason) and then tried to build itself on my system, which also failed, so I'm not sure how that would be properly implemented, especially when just the necessary import causes these problems.
I actually properly implemented it into Bayesian Merger, which has a very similar code structure to Super Merger. You can check it out here:
github.com/mariaWitch/sd-webui-bayesian-merger/blob/double-diff-cosine/sd_webui_bayesian_merger/merger.py#L250-L263
But essentially I had to convert the tensor that gets passed to SciPy (in this case CuPyX) to a DLPack capsule, then use CuPy to convert it into a CuPy array, and then pass that to the filters. Once that was done, I was able to convert it back into a DLPack capsule, and then convert it back into a tensor with from_dlpack. The reason we have to convert the tensor to DLPack is that this is the only supported way of doing a zero-copy transfer of a cpu tensor into a CuPy array (which is on the GPU, as CuPyX does not support standard NumPy arrays). By doing it this way, we avoid costly memory transfers between system RAM and VRAM that would otherwise decrease performance. It should be noted that CuPy and CuPyX (part of the same package) both have nearly identical functions to their non-CUDA counterparts, so much so that I literally just aliased cupyx.scipy as scipy when I imported it.
from torch.utils.dlpack import to_dlpack
from torch.utils.dlpack import from_dlpack
import cupy as cp
import cupyx.scipy as scipy
from cupyx.scipy.ndimage._filters import median_filter as filter
These would be the imports that you would bring in if CuPy was installed, (these would be imported in place of scipy)
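For readers unfamiliar with DLPack: the exchange hands over the underlying buffer instead of copying it, and it is generally zero-copy only when producer and consumer are on the same device. The sketch below shows that property using NumPy alone as a stand-in, an assumption made so the example runs without a GPU. With CuPy, the tensor would normally need to be on the GPU before the exchange.

```python
import numpy as np

# A float32 "weight" array standing in for a tensor.
weights = np.arange(6, dtype=np.float32).reshape(2, 3)

# np.from_dlpack consumes any object exposing __dlpack__ and returns an
# array that views the same memory rather than a copy.
view = np.from_dlpack(weights)

# A write through the original is visible through the DLPack view.
weights[0, 0] = 42.0
shared = np.shares_memory(weights, view)
```

The same pattern applies to the `to_dlpack` / `from_dlpack` pair quoted above: the filtered result can be handed back to the tensor library without an extra copy.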
As for your installation issues, I have had no such bad luck. But I think CuPy has experimental support for ROCm as well. Either way, it can exist as something that works if the script can import it, and otherwise it could just fall back. I saw roughly a 10x reduction in run time just from using CuPy instead of SciPy for the filters, so I think it is definitely worth trying to get working.
Also, could you elaborate a little on what you mean by the "structure" of the model and the "details" of a model in the context that you used them in, it seems a bit abstract, and could mean a lot of different things.
@SwiftIllusion Thanks a lot! I will be implementing the method you described in the next update. I need your help. I am also planning to implement a new calculation method and will create a new README about the calculation methods. Could you please write an explanation of the calculation methods you have introduced? https://github.com/hako-mikan/sd-webui-supermerger/blob/ver10/calcmode.md
@mariaWitch Thanks for your advice. Certainly the faster the calculation the better, so it would be good if we could implement the method you have introduced. On the other hand, methods that depend on the environment cause many problems. Especially users who use google colab seem to have a lot of import problems. Thus, I would consider it to work with or without installation.
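One way to make the filters "work with or without installation" is to try the GPU backend first and fall back quietly. This is only a sketch of that idea: `load_ndimage_backend` is a hypothetical helper, and it assumes the two backends expose compatible `median_filter` and `gaussian_filter` functions, as described earlier in the thread.

```python
import importlib


def load_ndimage_backend():
    """Return (name, module) for the first importable ndimage backend.

    Tries the CuPy build first, then SciPy; returns (None, None) if neither
    can be imported, so the caller can disable the filtered modes.
    """
    for name in ("cupyx.scipy.ndimage", "scipy.ndimage"):
        try:
            return name, importlib.import_module(name)
        except Exception:  # ImportError, or a CUDA runtime problem at import
            continue
    return None, None


backend_name, ndimage = load_ndimage_backend()
```

Catching a broad `Exception` is deliberate here, because a CuPy install without a usable CUDA runtime may fail at import time with something other than `ImportError`.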
No worries :) glad to hear.
I don't have the technical wizardry or verbal expertise of some, but I've tried, with my own observations of its development/output alongside ChatGPT, to provide some more guidance/details below, as I know what it's like to see new tech and have no idea what it's doing or how to take advantage of it. Hope it helps.
Normal calculation method. Can be used in all modes.
The comparison of two models is performed using cosine similarity, centered on the set ratio, and is calculated to eliminate loss due to merging. See below for further details. https://github.com/hako-mikan/sd-webui-supermerger/issues/33 https://github.com/recoilme/losslessmix
The original simple weight mode is the most basic method and works by linearly interpolating between the two models based on a given weight alpha. At alpha = 0, the output is the first model (model A), and at alpha = 1, the output is the second model (model B). Any other value of alpha results in a weighted average of the two models.
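As a small illustration of the simple weight mode, here is a sketch of the linear interpolation on toy data. Plain dicts of float lists stand in for model state dicts, and `weighted_sum` is a hypothetical name, not the extension's function.

```python
def weighted_sum(theta_a, theta_b, alpha):
    """Linearly interpolate two toy state dicts key by key."""
    merged = {}
    for key, a_vals in theta_a.items():
        if key not in theta_b:
            merged[key] = list(a_vals)  # keep A where B has no counterpart
            continue
        merged[key] = [(1 - alpha) * a + alpha * b
                       for a, b in zip(a_vals, theta_b[key])]
    return merged


model_a = {"layer.weight": [0.0, 2.0]}
model_b = {"layer.weight": [4.0, 6.0]}
quarter = weighted_sum(model_a, model_b, 0.25)
```

With `alpha = 0.25`, every value sits a quarter of the way from model A to model B.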
[Images: charming girl mid-shot; scenery-beautiful majestic]
One key advantage of the cosine methods over the original simple weight mode is that they take into account the structural similarity between the two models, which can lead to better results when the two models are similar but not identical. Another advantage of the cosine methods is that they can help prevent overfitting and improve generalization by limiting the amount of detail from one model that is incorporated into the other.
In the case of CosineA, we normalize the vectors of the first model (model A) before merging, so the resulting merged model will favor the structure of the first model while incorporating details from the second model. This is because we are essentially aligning the direction of the first model's vectors with the direction of the corresponding vectors in the second model.
Detail-wise, for example, note how above and below, in all cases, more blur is preserved for the background compared to the foreground, instead of the linear difference in the original merge.
On the other hand, in CosineB, we normalize the vectors of the second model (model B) before merging, so the resulting merged model will favor the structure of the second model while incorporating details from the first model. This is because we are aligning the direction of the second model's vectors with the direction of the corresponding vectors in the first model.
In summary, the choice between CosineA and CosineB depends on which model's structure you want to prioritize in the resulting merged model. If you want to prioritize the structure of the first model, use CosineA. If you want to prioritize the structure of the second model, use CosineB.
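To make the mechanics more concrete, here is a simplified sketch of how the per-weight ratio `k` is derived in the code posted earlier in this thread: the similarity is rescaled to the range observed across the model, shifted by alpha, and clamped to [0, 1]. It assumes 1-D vectors and a scalar similarity (the posted code works per column on tensors), and `sim_min` / `sim_max` stand for the trimmed minimum and maximum collected in the first pass.

```python
import numpy as np


def cosine_blend(a, b, alpha, sim_min, sim_max):
    """Blend two 1-D weight vectors using a similarity-derived ratio k."""
    sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    k = (sim - sim_min) / (sim_max - sim_min)  # rescale to the observed range
    k = float(np.clip(k - alpha, 0.0, 1.0))    # shift by alpha, then clamp
    return b * (1.0 - k) + a * k, k


a = np.array([1.0, 0.0])
b = np.array([2.0, 0.0])   # same direction as a, different magnitude
merged, k = cosine_blend(a, b, alpha=0.25, sim_min=0.0, sim_max=1.0)
```

In this toy case the vectors point the same way, so `k` is 0.75 and the result keeps three quarters of A. Weights with low similarity get a small `k` and lean toward B.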
Note also how the second model is more the 'reference point' for the merging looking at Alpha 1 compared to the changes at 0, so the order of models can also change the end result to look for your desired output.
A method of add difference that mixes the benefits of Median and Gaussian filters, to add model differences in a smoother way trying to avoid the negative 'burning' effect that can be seen when adding too many models this way. This also achieves more than just simply adding the difference at a lower value.
The starting point for reference
Adding a collection of models on top of it, each with a value of 1
The burn here is very obvious
Adding a collection of models on top of it, each with a value of 0.5
Still not an outcome I would accept, especially you can see with the bird
Reduces noise in the difference by replacing each value with the median of the neighboring values.
Preserves edges and structures in the difference, which is helpful when you want to transfer the learning related to object shapes and boundaries.
Non-linear filtering, which means it can better preserve the important features in the difference while reducing noise.
Smooths the difference by applying a Gaussian kernel, which reduces high-frequency noise and retains the low-frequency components.
The level of smoothing can be controlled by the sigma parameter, allowing you to experiment with different levels of smoothing.
Linear filtering, which means it can better preserve the global structure in the difference while reducing noise.
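The two filters described above can be sketched on a 1-D array with NumPy only. This is a simplified stand-in for `scipy.ndimage.median_filter` and `scipy.ndimage.gaussian_filter` (which the posted code applies to full weight tensors), written out so the effect of each step is visible. The function names are hypothetical, and the window size is assumed to be odd.

```python
import numpy as np


def median_filter_1d(x, size=3):
    """Sliding-window median with mirrored edges (odd size assumed)."""
    pad = size // 2
    padded = np.pad(x, pad, mode="symmetric")
    windows = np.stack([padded[i:i + len(x)] for i in range(size)])
    return np.median(windows, axis=0)


def gaussian_filter_1d(x, sigma=1.0):
    """Convolve with a normalised Gaussian kernel, mirrored edges."""
    radius = int(4 * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(x, radius, mode="symmetric")
    return np.convolve(padded, kernel, mode="valid")


def smooth_add(theta_a, diff, alpha):
    """Add a median- then Gaussian-filtered difference to the base weights."""
    smoothed = gaussian_filter_1d(median_filter_1d(diff, size=3), sigma=1.0)
    return theta_a + alpha * smoothed


spike = np.array([0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0])  # one outlier value
step = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])          # a clean edge
result = smooth_add(np.ones(7), spike, alpha=1.0)
```

The isolated spike is removed entirely by the median step, while the step edge passes through it unchanged, which matches the "reduces noise, preserves edges" description above.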
The final result when instead using the combination of Median and Gaussian filters. Note also, compared with either the Median or Gaussian filter individually, how the top left of the man's hair in the top right image doesn't get 'stuck' when combining them here, achieving the best result overall.
TIP Sometimes you may want to use this smooth Add difference as an alternative to the regular, even without the risk of burning. In these cases you could increase the Alpha up to 2, as smooth Add at 1 is a lower impact change individually than regular Add, but this of course depends on your desired outcome.
@mariaWitch Regrettably without being able to install the requirements for the include of your method, I've been unable to see that here. I also spent hours trying to implement other methods/performance improvements to the filters within the existing scope, but the closest I got was a different method for one of the two filters that resulted in a completely different/wrong output, the rest of the time was spent with errors, so I've had to consider that beyond the scope of what I can achieve. At least though with also seeing the additional prompt about the readme, I've now outlined everything better above to hopefully help you and others better understand what I was referring to in the cosine merge methods and how to take advantage of it all.
So Structure refers to the background and pose, and details refer to the actual character details on the subject, that makes it a lot more clear.
@SwiftIllusion Great! Thanks a lot! Your explanation with the poses is very clear.
Updated
I just checked the A/B cosine and the results are impressive. Thanks a lot!
@recoilme Awesome, I'm really happy to hear that :), thank you very much for the original inspiration. That result is amazing :D.
theta_0[key] = theta_1[key] * (1 - k) + theta_0[key] * k
@SwiftIllusion Why was this changed from theta_0[key] = theta_0[key] * (1 - k) + theta_1[key] * k to the line above? This seems a bit backwards now.
@mariaWitch This was to fix the fact it was actually previously merging backwards (if you put the weight at 0.75, it would have been 0.75 to modelA instead of modelB). Now, as per the examples in the guide which was made after this fix, it has the output correctly going from 0 A to 1 B.
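The direction issue can be checked with plain numbers. Because `k` is the normalised similarity minus alpha, a higher alpha gives a lower `k`. This sketch uses scalar stand-ins for the weights and hypothetical function names, and considers a weight whose normalised similarity is 1.

```python
def blend_old(a, b, k):
    """Original order: k is the share taken from B."""
    return a * (1 - k) + b * k


def blend_new(a, b, k):
    """Revised order: k is the share kept from A."""
    return b * (1 - k) + a * k


def ratio(similarity_norm, alpha):
    """k as computed in the thread: normalised similarity minus alpha, clamped."""
    return min(max(similarity_norm - alpha, 0.0), 1.0)


A, B = 0.0, 10.0
k_at_one = ratio(1.0, alpha=1.0)    # alpha fully toward B
k_at_zero = ratio(1.0, alpha=0.0)   # alpha fully toward A
```

At alpha 1, `k` is 0, so the old order returns A while the revised order returns B, which is the 0 = A to 1 = B behaviour described in the reply above.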
Thank you! That is clear enough for me.
@hako-mikan After getting the previous merge methods working, I was left with a theory of a theory to tackle a problem I wasn't sure would be possible, eventually I discovered it was possible. I couldn't explain the technicality of it, and GPT never knew what I was trying to do, but I tested it and have since been working with it after discovering/confirming it properly worked, to see just how far I could push its potential myself and so I could work out a guide/tips for it too to give people a full head start on how to work with it and what to try avoid. I don't know why it works, but it works, and significantly expands the possibilities of merges and models. It's something I hope can positively evolve how people develop merged models and share trained models.
I hope you can and would appreciate you adding this whenever you get the opportunity, I've provided the code at the end of this post like previously, adding it into the latest version (Commits on Jun 5, 2023) as a new choice for Add Difference.
This method at its simplest, can be thought of as a 'super Lora' for permanent merges, it no longer adds the calculated difference between (B)-(C) models to model (A), now it 'trains' that difference as if it was finetuning it relative to model (A).
In supermerger.py replace
calcmode = gr.Radio(label = "Calcutation Mode",choices = ["normal", "cosineA", "cosineB", "smoothAdd","tensor"], value = "normal")
with
calcmode = gr.Radio(label = "Calcutation Mode",choices = ["normal", "cosineA", "cosineB", "trainDifference", "smoothAdd","tensor"], value = "normal")
Then in mergers.py replace
if MODES[1] in mode:#Add
    if stopmerge: return "STOPPED", *non4
    theta_2 = load_model_weights_m(model_c,False,False,save).copy()
    for key in tqdm(theta_1.keys()):
        if 'model' in key:
            if key in theta_2:
                t2 = theta_2.get(key, torch.zeros_like(theta_1[key]))
                theta_1[key] = theta_1[key]- t2
            else:
                theta_1[key] = torch.zeros_like(theta_1[key])
    del theta_2
with
if MODES[1] in mode:#Add
    if stopmerge: return "STOPPED", *non4
    if calcmode == "trainDifference":
        theta_2 = load_model_weights_m(model_c,True,False,save).copy()
    else:
        theta_2 = load_model_weights_m(model_c,False,False,save).copy()
        for key in tqdm(theta_1.keys()):
            if 'model' in key:
                if key in theta_2:
                    t2 = theta_2.get(key, torch.zeros_like(theta_1[key]))
                    theta_1[key] = theta_1[key]- t2
                else:
                    theta_1[key] = torch.zeros_like(theta_1[key])
        del theta_2
and replace
if MODES[2] in mode or MODES[3] in mode:#Tripe or Twice
    theta_2 = load_model_weights_m(model_c,False,False,save).copy()
else:
    theta_2 = {}
with
if MODES[2] in mode or MODES[3] in mode:#Tripe or Twice
    theta_2 = load_model_weights_m(model_c,False,False,save).copy()
else:
    if calcmode != "trainDifference":
        theta_2 = {}
and replace
if usebeta and (not key in theta_2) and (not theta_2 == {}) :
    continue
with
if calcmode == "trainDifference":
    if key not in theta_2:
        continue
else:
    if usebeta and (not key in theta_2) and (not theta_2 == {}) :
        continue
and between the "cosineB" and "smoothAdd" methods, add the following (note that multiplying current_alpha by 1.8 is intentional; I don't understand the maths, but from testing, that makes the 'training' amount equivalent to 1:1 when current_alpha is set to 1)
elif calcmode == "trainDifference":
    # Check if theta_1[key] is equal to theta_2[key]
    if torch.allclose(theta_1[key].float(), theta_2[key].float(), rtol=0, atol=0):
        theta_2[key] = theta_0[key]
        continue
    diff_AB = theta_1[key].float() - theta_2[key].float()
    distance_A0 = torch.abs(theta_1[key].float() - theta_2[key].float())
    distance_A1 = torch.abs(theta_1[key].float() - theta_0[key].float())
    sum_distances = distance_A0 + distance_A1
    scale = torch.where(sum_distances != 0, distance_A1 / sum_distances, torch.tensor(0.).float())
    sign_scale = torch.sign(theta_1[key].float() - theta_2[key].float())
    scale = sign_scale * torch.abs(scale)
    new_diff = scale * torch.abs(diff_AB)
    theta_0[key] = theta_0[key] + (new_diff * (current_alpha*1.8))
and after the last "del theta_1" add
if calcmode == "trainDifference":
    del theta_2
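For anyone who wants to see what the trainDifference arithmetic does to individual values, here is a NumPy transcription of the snippet above on toy arrays. It is a sketch for experimentation rather than the extension's code: `train_difference` is a hypothetical name, and `a`, `b`, `c` stand for `theta_0`, `theta_1`, `theta_2`.

```python
import numpy as np


def train_difference(a, b, c, alpha, multiplier=1.8):
    """Scale the (B - C) difference by how far B is from A, then add to A."""
    diff = b - c
    dist_bc = np.abs(diff)       # distance between B and C
    dist_ba = np.abs(b - a)      # distance between B and A
    total = dist_bc + dist_ba
    # Safe division: where both distances are zero the scale stays 0.
    scale = np.divide(dist_ba, total, out=np.zeros_like(total), where=total != 0)
    new_diff = np.sign(diff) * scale * dist_bc
    return a + new_diff * (alpha * multiplier)


a = np.array([1.0, 2.0, 5.0])
b = np.array([2.0, 2.0, 5.0])
c = np.array([1.0, 1.0, 5.0])
out = train_difference(a, b, c, alpha=1.0)
```

In the second position A already equals B, so nothing is added even though B differs from C. In the first position A equals C, so half of the difference is scaled by 1.8, giving 0.9 of the plain difference.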
Added trainDifference. Thanks!!
No worries, thank you very much for your work on this/implementing it :) .
@SwiftIllusion @mariaWitch I just want to say that @sverfier8807 has implemented multi-threading for smoothAdd and now it works much faster.
@SwiftIllusion @hako-mikan maybe here instead of 1.8 multiplier should be 2? //H - harmonic mean
@StAlKeR7779 Sorry as I don't know the value of the math you're displaying and I appreciate the thought to improve it further, but I did many tests across different merges (e.g. including merging models trained on SDv1.4 to SDv1.5 or SDv1.5 to Practically the same model, and different Lora comparisons which you can see in the guide how it correlates in strength). Anything beyond 1.8 started to 'burn/over-train' in a way that appeared greater than the original (I tested from 1 to 2). Even 1.9 appeared too much which surprised me at the time as I was expecting 2 to be the most natural value if it required more than 1, but 1.8 was the most representative and I've used it greatly since then with that value.
While trying to find methods to improve models, one of the things to look into was merging and hopefully the below discoveries are valuable in helping improve/provide additional options for merging.
Sum merging
Initially for this I started with inspiration after finding https://github.com/recoilme/losslessmix. However through ChatGPT (regardless of your feelings about utilizing this/concerns with accuracy the below outputs hopefully show the interactions held value), I found that to just be working with the vector orientations. I expanded that to also take into account the magnitude and combined the results for the best merging outputs in my comparisons. One of the difficulties with sum merging is you can find you lose some things through the merge, below is a comparison with different prompts and 2 seeds between regular merge and the new method. You can see below the improved details/depth especially in the jewelry, and in the top girls background, the top birds twig also connects better besides the extra details, and improved hands for the guy on the right.
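The combined similarity described here can be sketched in NumPy as follows, mirroring the formula in the snippets quoted elsewhere in this thread. One caveat worth stating: as written there, the second term divides the dot product of the flattened tensors by the product of their norms, so it is itself a cosine similarity over the whole tensor rather than a pure magnitude comparison. `combined_similarity` is a hypothetical function name and assumes 2-D inputs without all-zero columns.

```python
import numpy as np


def combined_similarity(a, b):
    """Average of per-column cosine similarity and whole-tensor cosine similarity."""
    col_sim = (a * b).sum(axis=0) / (
        np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0))
    flat_sim = np.dot(a.ravel(), b.ravel()) / (
        np.linalg.norm(a) * np.linalg.norm(b))
    return (col_sim + flat_sim) / 2.0


a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[1.0, 0.0], [0.0, -1.0]])  # second column flipped
sims = combined_similarity(a, b)
```

Here the first column agrees and the second is opposite, while the whole-tensor term is 0, so the combined values are 0.5 and -0.5.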
Add difference merging
One of the things that has been most difficult with add merging, is the rate in which consecutive merging in attempts to gain more learning can lead to burnt/overexposed-like colors and edges. To demonstrate this and what the new method achieves, these are comparisons starting with seekArtMega20, then adding dreamlike and openjourneyV2 (with sdv1.5 as model C for the difference).
The code
This may be a bit messy for implementation, I just replaced the existing methods for merging/adding (I don't have the ability or experience to make this into additional options/a pull request), but here is what I used.
Relevant changes
Sum merging
Replacing "theta_0[key] = (1 - current_alpha) * theta_0[key] + current_alpha * theta_1[key]" with
Add Difference merging
Required 'pip install scipy' in the Automatic1111 directory for the filters
Add above "theta_0[key] = theta_0[key] + current_alpha * theta_1[key]"
The full script
(The above within the full context of the merger script as it was at the time I made the above changes)
from inspect import currentframe
mergedmodel=[]
typesg = ["none","alpha","beta (if Triple or Twice is not selected,Twice automatically enable)","alpha and beta","seed", "mbw alpha","mbw beta","mbw alpha and beta", "model_A","model_B","model_C","pinpoint blocks (alpha or beta must be selected for another axis)","elemental","pinpoint element","effective elemental checker"]
types = ["none","alpha","beta","alpha and beta","seed", "mbw alpha ","mbw beta","mbw alpha and beta", "model_A","model_B","model_C","pinpoint blocks","elemental","pinpoint element","effective"]
modes=["Weight" ,"Add" ,"Triple","Twice"]
sevemodes=["save model", "overwrite"]
#type[0:aplha,1:beta,2:seed,3:mbw,4:model_A,5:model_B,6:model_C]
#msettings=[0 weights_a,1 weights_b,2 model_a,3 model_b,4 model_c,5 base_alpha,6 base_beta,7 mode,8 useblocks,9 custom_name,10 save_sets,11 id_sets,12 wpresets]
#id sets "image", "PNG info","XY grid"
hear = False
hearm = False
non3 = [None]*3
def caster(news,hear):
    if hear: print(news)
def casterr(*args,hear=hear):
    if hear:
        names = {id(v): k for k, v in currentframe().f_back.f_locals.items()}
        print('\n'.join([names.get(id(arg), '???') + ' = ' + repr(arg) for arg in args]))
#msettings=[weights_a,weights_b,model_a,model_b,model_c,device,base_alpha,base_beta,mode,loranames,useblocks,custom_name,save_sets,id_sets,wpresets,deep]
def smergegen(weights_a,weights_b,model_a,model_b,model_c,base_alpha,base_beta,mode,useblocks,custom_name,save_sets,id_sets,wpresets,deep,esettings, prompt,nprompt,steps,sampler,cfg,seed,w,h,currentmodel,imggen):
NUM_INPUT_BLOCKS = 12
NUM_MID_BLOCK = 1
NUM_OUTPUT_BLOCKS = 12
NUM_TOTAL_BLOCKS = NUM_INPUT_BLOCKS + NUM_MID_BLOCK + NUM_OUTPUT_BLOCKS
blockid=["BASE","IN00","IN01","IN02","IN03","IN04","IN05","IN06","IN07","IN08","IN09","IN10","IN11","M00","OUT00","OUT01","OUT02","OUT03","OUT04","OUT05","OUT06","OUT07","OUT08","OUT09","OUT10","OUT11"]
def smerge(weights_a,weights_b,model_a,model_b,model_c,base_alpha,base_beta,mode,useblocks,custom_name,save_sets,id_sets,wpresets,deep,deepprint = False):
    caster("merge start",hearm)
    global hear
    global mergedmodel
def load_model_weights_m(model,model_a,model_b,save):
    checkpoint_info = sd_models.get_closet_checkpoint_match(model)
    sd_model_name = checkpoint_info.model_name
def makemodelname(weights_a,weights_b,model_a, model_b,model_c, alpha,beta,useblocks,mode):
    model_a=filenamecutter(model_a)
    model_b=filenamecutter(model_b)
    model_c=filenamecutter(model_c)
path_root = scripts.basedir()
def rwmergelog(mergedname = "",settings= [],id = 0):
    setting = settings.copy()
    filepath = os.path.join(path_root, "mergehistory.csv")
    is_file = os.path.isfile(filepath)
    if not is_file:
        with open(filepath, 'a') as f:
#msettings=[0 weights_a,1 weights_b,2 model_a,3 model_b,4 model_c,5 base_alpha,6 base_beta,7 mode,8 useblocks,9 custom_name,10 save_sets,11 id_sets,12 wpresets]
def draw_origin(grid, text,width,height,width_one):
    grid_d= Image.new("RGB", (grid.width,grid.height), "white")
    grid_d.paste(grid,(0,0))
    def get_font(fontsize):
        try:
            return ImageFont.truetype(opts.font or Roboto, fontsize)
        except Exception:
            return ImageFont.truetype(Roboto, fontsize)
    d= ImageDraw.Draw(grid_d)
    color_active = (0, 0, 0)
    fontsize = (width+height)//25
    fnt = get_font(fontsize)
def wpreseter(w,presets):
    if "," not in w and w != "":
        presets=presets.splitlines()
        wdict={}
        for l in presets:
            if ":" in l :
                key = l.split(":",1)[0]
                wdict[key.strip()]=l.split(":",1)[1]
            if "\t" in l:
                key = l.split("\t",1)[0]
                wdict[key.strip()]=l.split("\t",1)[1]
        if w.strip() in wdict:
            name = w
            w = wdict[w.strip()]
            print(f"weights {name} imported from presets : {w}")
    return w
def fullpathfromname(name):
    if name == "" or name ==[]: return ""
    checkpoint_info = sd_models.get_closet_checkpoint_match(name)
    return checkpoint_info.filename
def namefromhash(hash):
    if hash == "" or hash ==[]: return ""
    checkpoint_info = sd_models.get_closet_checkpoint_match(hash)
    return checkpoint_info.model_name
def hashfromname(name):
    from modules import sd_models
    if name == "" or name ==[]: return ""
    checkpoint_info = sd_models.get_closet_checkpoint_match(name)
    if checkpoint_info.shorthash is not None:
        return checkpoint_info.shorthash
    return checkpoint_info.calculate_shorthash()
def simggen(prompt, nprompt, steps, sampler, cfg, seed, w, h,mergeinfo="",id_sets=[],modelid = "no id"):
    shared.state.begin()
    p = processing.StableDiffusionProcessingTxt2Img(
        sd_model=shared.sd_model,
        do_not_save_grid=True,
        do_not_save_samples=True,
        do_not_reload_embeddings=True,
    )
    p.batch_size = 1
    p.prompt = prompt
    p.negative_prompt = nprompt
    p.steps = steps
    p.sampler_name = sd_samplers.samplers[sampler].name
    p.cfg_scale = cfg
    p.seed = seed
    p.width = w
    p.height = h
    p.seed_resize_from_w=0
    p.seed_resize_from_h=0
    p.denoising_strength=None