kohya-ss / sd-scripts

Apache License 2.0

LoHA error when using preset Full. RuntimeError: The size of tensor a (32) must match the size of tensor b (30) at non-singleton dimension 3 #1076

Open DarkAlchy opened 9 months ago

DarkAlchy commented 9 months ago

Doesn't happen with the other presets.

Traceback (most recent call last):
  File "F:\kohya_ss-win\sdxl_train_network.py", line 189, in <module>
    trainer.train(args)
  File "F:\kohya_ss-win\train_network.py", line 783, in train
    noise_pred = self.call_unet(
  File "F:\kohya_ss-win\sdxl_train_network.py", line 169, in call_unet
    noise_pred = unet(noisy_latents, timesteps, text_embedding, vector_embedding)
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\kohya_ss-win\venv\lib\site-packages\accelerate\utils\operations.py", line 680, in forward
    return model_forward(*args, **kwargs)
  File "F:\kohya_ss-win\venv\lib\site-packages\accelerate\utils\operations.py", line 668, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\amp\autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "F:\kohya_ss-win\library\sdxl_original_unet.py", line 1099, in forward
    h = call_module(module, h, emb, context)
  File "F:\kohya_ss-win\library\sdxl_original_unet.py", line 1088, in call_module
    x = layer(x, emb)
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\kohya_ss-win\library\sdxl_original_unet.py", line 343, in forward
    x = torch.utils.checkpoint.checkpoint(create_custom_forward(self.forward_body), x, emb, use_reentrant=USE_REENTRANT)
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\utils\checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\autograd\function.py", line 506, in apply
    return super().apply(*args, **kwargs) # type: ignore[misc]
  File "F:\kohya_ss-win\venv\lib\site-packages\torch\utils\checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "F:\kohya_ss-win\library\sdxl_original_unet.py", line 339, in custom_forward
    return func(*inputs)
  File "F:\kohya_ss-win\library\sdxl_original_unet.py", line 331, in forward_body
    return x + h
RuntimeError: The size of tensor a (32) must match the size of tensor b (30) at non-singleton dimension 3
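
A quick arithmetic check on the numbers in this error, offered as an observation rather than a diagnosis: the failing line is the residual addition `x + h`, and the two widths differ by exactly 2. That is the shrinkage a 3x3 convolution with stride 1 produces when it runs with no padding. The sketch below applies the standard convolution output-size formula. `conv_out_size` is a hypothetical helper, not part of sd-scripts or LyCORIS, and whether a missing padding value is the actual cause here is an assumption.

```python
def conv_out_size(size: int, kernel: int, stride: int = 1,
                  padding: int = 0, dilation: int = 1) -> int:
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# 3x3 conv, stride 1, padding 1: the width is preserved, so x + h would work.
same = conv_out_size(32, kernel=3, padding=1)

# Same conv with padding 0: the width shrinks by 2, matching the sizes in the error.
shrunk = conv_out_size(32, kernel=3, padding=0)

print(same, shrunk)  # prints: 32 30
```

If this pattern applies, the mismatch would come from a convolution inside the block rather than from the input resolution, but that would need to be confirmed against the actual module code.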

DarkAlchy commented 9 months ago

I had to roll back to the Dec 4 version of the GUI before the error went away (I'm not sure which version of the scripts it uses, since this is just the GUI). This is a new error that appeared after a forced update of bmalt's GUI.

kohya-ss commented 9 months ago

LoHA is not from this repository, so I'm not sure, but it looks like the image size is not divisible by 32. Check the resolution and/or bucket_reso_steps options.
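
The divisibility check suggested in the comment above can be sketched as follows. `check_resolution` is a hypothetical helper, not part of sd-scripts; the step of 32 is taken from that comment, and the example sizes are made up for illustration.

```python
from typing import Tuple


def check_resolution(width: int, height: int, step: int = 32) -> Tuple[bool, Tuple[int, int]]:
    """Report whether both sides are multiples of `step`.

    Also returns the largest size at or below the input whose sides
    are both multiples of `step`.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    ok = width % step == 0 and height % step == 0
    rounded = (width - width % step, height - height % step)
    return ok, rounded


print(check_resolution(1024, 1024))  # prints: (True, (1024, 1024))
print(check_resolution(1000, 1024))  # prints: (False, (992, 1024))
```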

DarkAlchy commented 9 months ago

I rolled back to Dec 4 and it worked, so the problem is in the update. Same dataset, same everything; the only change was rolling back to the Dec 4 GUI (and whatever version of your scripts it shipped with).

kohya-ss commented 8 months ago

Hmm, the sdxl_original_unet.py file was last updated Nov 26. The cause might be somewhere else, but I could not find it. Could you ask in the LyCORIS repo?

DarkAlchy commented 8 months ago

Since I have no idea which scripts were included in the Dec 4 edition, I can't say either. Bmalt may have left the old pre-Nov 26 scripts in place, which is what I would bet on.

Is there a way to see which version of the scripts is in this?

suede299 commented 8 months ago

(Two images attached.) This is the result of my own testing. Note that the GUI does not force this dependency to be updated, and I may have run into the same error when using "Resume" or "--network_weights" in my tests. Training a LoHA with the Full preset, however, works without problems. If you need extract and merge, you must manually make the *lycoris.py scripts in the tools folder match your installed dependency version.
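
Since the GUI does not force the dependency to be updated, one way to see what is actually installed is to query the package metadata from inside the GUI's virtual environment. This is a minimal sketch using the standard library; `installed_version` is a hypothetical helper, and `lycoris_lora` is assumed to be the distribution name of the LyCORIS package, so adjust it if your install uses a different name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumption: "lycoris_lora" is the distribution name; change it if yours differs.
print(installed_version("lycoris_lora"))
```

Run it with the Python interpreter from the same venv the GUI uses; otherwise it reports on a different environment.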

DarkAlchy commented 8 months ago

There is a problem with the latest version, so I will stand pat on the Dec 4 edition until I am forced to upgrade at gunpoint, since I know it works fine.