ogkalu2 / Merge-Stable-Diffusion-models-without-distortion

Adaptation of the merging method described in the paper - Git Re-Basin: Merging Models modulo Permutation Symmetries (https://arxiv.org/abs/2209.04836) for Stable Diffusion
MIT License

After running for a while, get error #1

Open Dawgmastah opened 1 year ago

Dawgmastah commented 1 year ago

Error is:

0/P_model.diffusion_model.output_blocks.6.0_inner3: 0.0
0/P_model.diffusion_model.output_blocks.4.0_inner2: 0.0
0/P_bg371: 0.0
0/P_bg206: 0.0
0/P_model.diffusion_model.output_blocks.6.0_inner2: 0.0

Traceback (most recent call last):
  File "X:\AIMODELS\SD_rebasin_merge.py", line 27, in <module>
    updated_params = unflatten_params(apply_permutation(permutation_spec, final_permutation, flatten_params(state_b)))
  File "X:\AIMODELS\weight_matching.py", line 786, in apply_permutation
    return {k: get_permuted_param(ps, perm, k, params) for k in params.keys()}
  File "X:\AIMODELS\weight_matching.py", line 786, in <dictcomp>
    return {k: get_permuted_param(ps, perm, k, params) for k in params.keys()}
  File "X:\AIMODELS\weight_matching.py", line 773, in get_permuted_param
    for axis, p in enumerate(ps.axes_to_perm[k]):
KeyError: 'first_stage_model.decoder.conv_in.weight'

ogkalu2 commented 1 year ago

@Dawgmastah Damn, I know exactly what this error means. You were so close to getting it finished, too. I put encoder instead of decoder on one of the layers. I've fixed it; you can re-download weight_matching and try it again.
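
For anyone hitting the same thing: the KeyError just means the spec's axes_to_perm table has no entry for that parameter name. A minimal sketch of the failure mode (the permutation names here are made up; only the parameter key comes from the traceback):

    # axes_to_perm maps each parameter name to one permutation id (or None) per axis.
    # A typo'd key ("encoder" instead of "decoder") means the real checkpoint key is missing.
    axes_to_perm = {
        "first_stage_model.encoder.conv_in.weight": ("P_dec_conv_in", None, None, None),
    }

    k = "first_stage_model.decoder.conv_in.weight"  # the key actually present in the checkpoint
    try:
        axes_to_perm[k]
    except KeyError as e:
        print("KeyError:", e)  # same failure as get_permuted_param above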

How long did it run, and what was the size of the models you were merging?

ogkalu2 commented 1 year ago

At least with this, I now know for sure the permutation spec itself will run without error.

Dawgmastah commented 1 year ago

It ran for about... 4 minutes? 4GB models.

ogkalu2 commented 1 year ago

> It ran for about... 4 minutes? 4GB models.

Really? Huh. What CPU are you running on? Can you try again and see if it works and/or takes longer than 4 minutes?

Dawgmastah commented 1 year ago

I have a Ryzen 5 5600X; running again right now, will report back.

ogkalu2 commented 1 year ago

> I have a Ryzen 5 5600X; running again right now, will report back.

Thanks. Also, how much RAM do you have?

Dawgmastah commented 1 year ago

32GB, of which 31 is in use, haha.

Dawgmastah commented 1 year ago

If relevant (though I think not, since JAX doesn't support CUDA on Windows), I have a Quadro P6000 with 24GB of memory.

Dawgmastah commented 1 year ago

New error, after about 6 minutes of running:

0/P_model.diffusion_model.input_blocks.5.0_inner: 0.0
0/P_bg105: 0.0
0/P_bg65: 0.0
0/P_bg398: 0.0
0/P_model.diffusion_model.middle_block.2_inner2: 0.0
0/P_bg229: 0.0
Saving...
Traceback (most recent call last):
  File "X:\AIMODELS\SD_rebasin_merge.py", line 49, in <module>
    "state_dict": state_b(updated_params)
TypeError: 'dict' object is not callable

ogkalu2 commented 1 year ago

OK, it's boggling my mind that your system is running through this in minutes, lol. What models are you merging, may I ask?

Anyway, it finished permuting, and it seems I can't save the new parameters with that line of code. Hold on, let me think about what to change.

Dawgmastah commented 1 year ago

Two Dreambooth models based on f222; I was testing to see if the Dreambooth training is retained.

I'm a super noob with Python, so I hope I installed the dependencies correctly? =S

It sounds fast, but the computer bogs down HARD and the fans ramp way up.

Dawgmastah commented 1 year ago

Disclaimer: I have one of the models open. Does the code try to save on top of it, and is that why it fails?

(I thought it wouldn't touch the models, so hopefully it's not that; I'd hate to have to retrain.)

ogkalu2 commented 1 year ago

> Disclaimer: I have one of the models open. Does the code try to save on top of it, and is that why it fails?
>
> (I thought it wouldn't touch the models, so hopefully it's not that; I'd hate to have to retrain.)

No, this wouldn't be why; don't worry. It's on my end.

Dawgmastah commented 1 year ago

I figured it out :) and sent a change request? (Sorry, I'm new to Git.)

After a quick Google search, I just needed to change to square brackets.

HOWEVER, it's not saving the final ckpt, or at least not in the same folder the models are in.

The console outputs a list of numbers so large I can't read the beginning of them. A section:

                bias: tensor([-3.0987e-01,  8.6523e-02, -1.7788e-01, -2.1266e-01,  1.0433e-02,
                        -2.1791e-01,  2.6793e-02, -1.2393e-01,  1.1371e-01, -1.0863e-01,
                        ...
                        -5.2106e-01, -2.6887e-01,  9.5460e-02]),

ogkalu2 commented 1 year ago

Instead of changing the brackets to square, try running it with state_b.update(updated_params)
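
(For context, a minimal sketch of why that helps: state_b is a plain dict, so calling it like a function raises the TypeError, while dict.update merges the permuted weights into it in place.)

    state_b = {"layer.weight": 1.0}          # stands in for model B's state dict
    updated_params = {"layer.weight": 2.0}   # stands in for the permuted parameters

    # state_b(updated_params)                # TypeError: 'dict' object is not callable
    state_b.update(updated_params)           # overwrite B's entries with the permuted ones
    print(state_b)                           # {'layer.weight': 2.0}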

Dawgmastah commented 1 year ago

On it

Dawgmastah commented 1 year ago

Update: it now outputs: Saving... Done!

HOWEVER, the merged file is 1 KB. When unzipping it, 2 files are inside: version and data.pkl.

The whole text of version is..... 3

And the full contents of data.pkl are:

€}q X state_dictqNs.

ogkalu2 commented 1 year ago

After the output file line in the SD_rebasin_merge file, paste

model_b["state_dict"].update(updated_params)

then replace the torch.save(...) call with torch.save(model_b, output_file) and run again.
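
Put together, the end of the script would look roughly like this (just a sketch; the variable names are assumptions about SD_rebasin_merge.py rather than its exact code):

    import torch

    def save_merged(model_b_path: str, updated_params: dict, output_file: str) -> None:
        """Write the permuted weights back into checkpoint B and save the full checkpoint."""
        model_b = torch.load(model_b_path, map_location="cpu")  # dict with a "state_dict" key
        model_b["state_dict"].update(updated_params)            # overwrite the matched tensors in place
        torch.save(model_b, output_file)                        # keeps the usual {"state_dict": ...} layout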

Dawgmastah commented 1 year ago

On it; hopefully I understood the instructions correctly.

ogkalu2 commented 1 year ago

Basically this. https://imgur.com/a/nrOLiMd

Dawgmastah commented 1 year ago

Then I did it right. Executing.

Dawgmastah commented 1 year ago

OK, progress: the merged model is now double the size of the initial ones (8GB). Testing if it works.

ogkalu2 commented 1 year ago

You might need to prune it to load it in AUTOMATIC1111's UI (if that's what you're using) and some other UIs. Well, fingers crossed, man, haha.

Dawgmastah commented 1 year ago

Indeed, it won't load. How does one prune a model? =S

Bunch of errors, including the safety checker:

Loading weights [0cdcfbbe] from X:\AIMODELS\merged.ckpt
Error verifying pickled file from X:\AIMODELS\merged.ckpt:
Traceback (most recent call last):
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 131, in load_with_extra
    check_pt(filename, extra_handler)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 89, in check_pt
    unpickler.load()
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 62, in find_class
    raise Exception(f"global '{module}/{name}' is forbidden")
Exception: global 'flax.core.frozen_dict/FrozenDict' is forbidden

The file may be malicious, so the program is not going to read it. You can skip this check with --disable-safe-unpickle commandline argument.

Traceback (most recent call last):
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
    output = await app.blocks.process_api(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 982, in process_api
    result = await self.call_function(fn_index, inputs, iterator)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 824, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\ui.py", line 1636, in <lambda>
    fn=lambda value, k=k: run_settings_single(value, key=k),
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\ui.py", line 1478, in run_settings_single
    opts.data_labels[key].onchange()
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\webui.py", line 41, in f
    res = func(*args, **kwargs)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\webui.py", line 83, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: modules.sd_models.reload_model_weights()))
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\sd_models.py", line 288, in reload_model_weights
    load_model_weights(sd_model, checkpoint_info)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\sd_models.py", line 176, in load_model_weights
    if "global_step" in pl_sd:
TypeError: argument of type 'NoneType' is not iterable

ogkalu2 commented 1 year ago

I've just uploaded a prune.py script (it wasn't written by me). Place it in the directory of the merged model and run python prune.py (in a console you can interact with; cmd is fine), because it will ask some questions: where the file is, where you want it, and what to name it. Don't forget to put .ckpt at the end of the names.
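
(Not the uploaded prune.py itself, but a sketch of what such pruning typically does: drop everything except the state dict and cast the tensors to fp16, which roughly halves the checkpoint size.)

    import torch

    def prune_checkpoint(in_path: str, out_path: str) -> None:
        """Keep only the model weights (no optimizer/extra states) and cast them to fp16."""
        ckpt = torch.load(in_path, map_location="cpu")
        sd = ckpt.get("state_dict", ckpt)
        new_sd = {k: v.half() if isinstance(v, torch.Tensor) else v for k, v in sd.items()}
        torch.save({"state_dict": new_sd}, out_path)

    # prune_checkpoint("merged.ckpt", "merged-pruned.ckpt")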

Dawgmastah commented 1 year ago

Error when pruning:

(merging) X:\AIMODELS>python prune.py
prunin' in path: X://AIMODELS//mergeds.ckpt
dict_keys(['state_dict'])
removing optimizer states for path X://AIMODELS//mergeds.ckpt
Traceback (most recent call last):
  File "X:\AIMODELS\prune.py", line 61, in <module>
    prune_it(ckpt)
  File "X:\AIMODELS\prune.py", line 36, in prune_it
    new_sd[k] = sd[k].half()
AttributeError: 'FrozenDict' object has no attribute 'half'

Dawgmastah commented 1 year ago

Perhaps it's saving stuff it shouldn't into the ckpt? Because AUTOMATIC1111 also complained about finding "unsecure" code related to FrozenDict.

ogkalu2 commented 1 year ago

Okay, let's go back a few steps then. I may have been overthinking some things. Use the old SD_rebasin_merge.py I uploaded.

In the torch.save line, have it be torch.save({ "state_dict": updated_params }, output_file) instead, and run it again.

Dawgmastah commented 1 year ago

On it

lopho commented 1 year ago

flax.core.frozen_dict/FrozenDict
AttributeError: 'FrozenDict' object has no attribute 'half'

The weights are not in PyTorch format anymore, but Flax. You either need a custom loader, or load via Flax and then convert to PyTorch.
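
(If the "load via Flax then convert" route is taken, a hedged sketch, assuming the merged params really are a nested FrozenDict of JAX arrays; the resulting key names may still need adjusting to match the original checkpoint.)

    import numpy as np
    import torch
    from flax.core.frozen_dict import unfreeze
    from flax.traverse_util import flatten_dict

    def frozen_to_torch_state_dict(frozen_params) -> dict:
        """Convert a nested Flax FrozenDict of arrays into a flat PyTorch-style state dict."""
        flat = flatten_dict(unfreeze(frozen_params))  # {('block', 'conv', 'kernel'): array, ...}
        return {".".join(k): torch.from_numpy(np.asarray(v)) for k, v in flat.items()}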

Dawgmastah commented 1 year ago

The file size is now more reasonable, at the expected 4GB.

But the error is the same:

Loading weights [8a5ea57f] from X:\AIMODELS\merged.ckpt
Error verifying pickled file from X:\AIMODELS\merged.ckpt:
Traceback (most recent call last):
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 131, in load_with_extra
    check_pt(filename, extra_handler)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 89, in check_pt
    unpickler.load()
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\safe.py", line 62, in find_class
    raise Exception(f"global '{module}/{name}' is forbidden")
Exception: global 'flax.core.frozen_dict/FrozenDict' is forbidden

The file may be malicious, so the program is not going to read it. You can skip this check with --disable-safe-unpickle commandline argument.

Traceback (most recent call last):
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
    output = await app.blocks.process_api(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 982, in process_api
    result = await self.call_function(fn_index, inputs, iterator)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 824, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\ui.py", line 1636, in <lambda>
    fn=lambda value, k=k: run_settings_single(value, key=k),
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\ui.py", line 1478, in run_settings_single
    opts.data_labels[key].onchange()
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\webui.py", line 41, in f
    res = func(*args, **kwargs)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\webui.py", line 83, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: modules.sd_models.reload_model_weights()))
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\sd_models.py", line 288, in reload_model_weights
    load_model_weights(sd_model, checkpoint_info)
  File "Z:\SUPERSTABLE\Autostable-diffusion-webui-master\modules\sd_models.py", line 176, in load_model_weights
    if "global_step" in pl_sd:
TypeError: argument of type 'NoneType' is not iterable

Dawgmastah commented 1 year ago

> flax.core.frozen_dict/FrozenDict
> AttributeError: 'FrozenDict' object has no attribute 'half'
>
> The weights are not in PyTorch format anymore, but Flax. You either need a custom loader, or load via Flax and then convert to PyTorch.

This is beyond my understanding; I'm just a tester of the tool.

@ogkalu2 Looks like we have a possible root cause.

ogkalu2 commented 1 year ago

@Dawgmastah Okay, that's an improvement, believe it or not. Now I need to either do what lopho says or rewrite the flatten and unflatten functions in PyTorch.

@lopho I see. I think the easiest solution, if possible, would be to just rewrite the flatten and unflatten functions in PyTorch, since that's where it's getting converted.

lopho commented 1 year ago

The state dict is already flat, if you disregard the outermost dict.

    import torch

    # the checkpoint is a dict whose "state_dict" entry maps flat, dot-separated
    # parameter names directly to tensors
    state_dict = torch.load('model.pt')['state_dict']
    for k in state_dict:
        print(k)

ogkalu2 commented 1 year ago

@lopho Yes, thank you.

@Dawgmastah Download the new files (SD_rebasin_merge and weight_matching) and run it again.

Dawgmastah commented 1 year ago

On it. By the way, what kind of runtime were you expecting? Since you sounded surprised it runs so fast.

ogkalu2 commented 1 year ago

> On it. By the way, what kind of runtime were you expecting? Since you sounded surprised it runs so fast.

It's not so much what I was expecting as what I was getting. It took ~12 hours to run the first iteration on my vast.ai instance. There's someone else here who kept getting out-of-memory errors on his system with 32 GB of RAM on 4GB models, so when I first saw your error, I thought you'd run it for hours just to get an error caused by a stupid spelling mistake, lol.

Dawgmastah commented 1 year ago

Interesting!

It LOADS!!!

So, initial feedback:

It seems the features and activation word of model B are greatly preserved. However, model A looks like it had its training destroyed (or is not being taken into account at all).

@ogkalu2 So it looks promising, but maybe some error in the formula for model A is overwriting it? It's pretty much gone.

I'll do some tests merging model A with NAI to see if everything goes cartoony and only the Dreambooth training is getting killed.

Dawgmastah commented 1 year ago

Scratch that, I can't merge with the NAI model; the error is as follows:

Traceback (most recent call last):
  File "X:\AIMODELS\SD_rebasin_merge.py", line 23, in <module>
    final_permutation = weight_matching(permutation_spec, state_a, state_b)
  File "X:\AIMODELS\weight_matching.py", line 780, in weight_matching
    perm_sizes = {p: params_a[axes[0][0]].shape[axes[0][1]] for p, axes in ps.perm_to_axes.items()}
  File "X:\AIMODELS\weight_matching.py", line 780, in <dictcomp>
    perm_sizes = {p: params_a[axes[0][0]].shape[axes[0][1]] for p, axes in ps.perm_to_axes.items()}
KeyError: 'cond_stage_model.transformer.text_model.embeddings.token_embedding.weight'

What I'll do is merge model B with any other model, and see if the same images as with B alone are recreated using the same prompt and seed (meaning it was pretty much untouched).

Dawgmastah commented 1 year ago

Dreambooth has additional values for the token used to specify the subject. To make this cross DB / non-DB compatible, one would have to skip mismatched weights. Makes it harder to debug though as it would not fail if some other keys mismatched.
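
(A hedged sketch of that "skip mismatched weights" idea, not code from this repo: only consider keys that exist in both models and in the permutation spec, and pass everything else through untouched.)

    def mergeable_keys(state_a: dict, state_b: dict, axes_to_perm: dict) -> list:
        """Keys safe to permute/merge; everything else would be copied from one model unchanged."""
        shared = set(state_a) & set(state_b) & set(axes_to_perm)
        skipped = (set(state_a) | set(state_b)) - shared
        if skipped:
            # this silence is the debugging downside mentioned above
            print(f"skipping {len(skipped)} mismatched or unspecced keys")
        return sorted(shared)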

OK, summarizing: I merged DB model A with DB model B; only the features and token from model B were kept. Tried NAI + DB model B = fail.

Currently I'm merging Berrymix (which uses a DB model as one of its recipes) with DB model B, and it's working.

So right now, if I understand correctly, the code will be able to merge DB + DB (or DB merge) and also base model + base model (such as 1.4 and NAI), but not between them?

Maybe a workaround I could try is doing an AUTOMATIC1111 merge of the model you want with the DB data (like NAI), at 99% NAI.

Dawgmastah commented 1 year ago

Shouldn't DB + DB keep both tokens, though?

Dawgmastah commented 1 year ago

Yeah... @lopho, @ogkalu2 I just confirmed the hash of the merge is exactly the same as model B. So somewhere in the process, model A is not being considered, I think.

ogkalu2 commented 1 year ago

NAI is failing because the other models have an extra layer that NAI does not. As it runs through the permutation spec, it can't find the designation for the 'cond_stage_model.transformer.text_model.embeddings.token_embedding' key/layer in NAI's state_dict.
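
A quick way to confirm which keys the two checkpoints disagree on (paths are placeholders):

    import torch

    a = torch.load("model_a.ckpt", map_location="cpu")["state_dict"]
    b = torch.load("nai.ckpt", map_location="cpu")["state_dict"]

    print("only in A:", sorted(set(a) - set(b)))  # keys the spec may expect but the other model lacks
    print("only in B:", sorted(set(b) - set(a)))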

ogkalu2 commented 1 year ago

Now I'm wondering if this is specific to NAI or to Dreambooth. I mean, is that a Dreambooth layer? If you try merging the untouched base model and a Dreambooth model, do you get the same error?

ghost commented 1 year ago

> NAI is failing because the other models have an extra layer that NAI does not. As it runs through the permutation spec, it can't find the designation for the 'cond_stage_model.transformer.text_model.embeddings.token_embedding' key/layer in NAI's state_dict.

IIRC, when I tried running this, it was because it wasn't classified under text_model, so the correct key for the NAI model is: cond_stage_model.transformer.embeddings.token_embedding
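
(A hedged sketch of how that rename could be papered over before matching; the exact suffixes, e.g. .weight, are assumptions:)

    # map NAI's key naming onto the naming the permutation spec expects
    NAI_RENAMES = {
        "cond_stage_model.transformer.embeddings.token_embedding.weight":
            "cond_stage_model.transformer.text_model.embeddings.token_embedding.weight",
    }

    def rename_keys(state_dict: dict) -> dict:
        return {NAI_RENAMES.get(k, k): v for k, v in state_dict.items()}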

ogkalu2 commented 1 year ago

> Yeah... @lopho, @ogkalu2 I just confirmed the hash of the merge is exactly the same as model B. So somewhere in the process, model A is not being considered, I think.

Hmm

ogkalu2 commented 1 year ago

> NAI is failing because the other models have an extra layer that NAI does not. As it runs through the permutation spec, it can't find the designation for the 'cond_stage_model.transformer.text_model.embeddings.token_embedding' key/layer in NAI's state_dict.
>
> IIRC, when I tried running this, it was because it wasn't classified under text_model, so the correct key for the NAI model is: cond_stage_model.transformer.embeddings.token_embedding

Oh, I see. Thank you.

Dawgmastah commented 1 year ago

I will try that next. Weird thing, though: even though the hash is the same, the output pictures are not exactly the same, but very, very close. So close you need to tab between the images to tell them apart:

[images: merged vs. baseDB]

In some other images, where she looks nothing like the training, they were pretty much identical.

ogkalu2 commented 1 year ago

> I will try that next. Weird thing, though: even though the hash is the same, the output pictures are not exactly the same, but very, very close. So close you need to tab between the images to tell them apart:
>
> [images: merged vs. baseDB]
>
> In some other images, where she looks nothing like the training, they were pretty much identical.

Oh, you don't need to bother. It looks like NAI renamed the same layer in its state dict for some reason. So you shouldn't get any errors.

ogkalu2 commented 1 year ago

@Dawgmastah When you run this... do you see all the layers permuted? For example, when I run it, I'd see 0/P_bg57, 0/P_bg300, and so on until all 407 axes were covered for 1 iteration. I know this runs fast for you, but does that still happen?