Jack000 / glid-3-xl-stable

stable diffusion training
MIT License
290 stars 36 forks

size mismatch ... copying a param with shape torch.Size([320, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([320, 4, 3, 3]) #17

Open chavinlo opened 1 year ago

chavinlo commented 1 year ago

Hello, sorry to bother you again. I tried merging the resulting .pt file (not trained from scratch, but trained for a few steps starting from your previous inpaint .pt checkpoint) with the SD 1.4 model. When I load the merged model into a web UI (auto1111 in this case), I always get the following error:


Loading weights [681cbf52] from /content/stable-diffusion-webui/models/Stable-diffusion/model.ckpt
Global Step: 470000
Traceback (most recent call last):
  File "launch.py", line 143, in <module>
    start_webui()
  File "launch.py", line 139, in start_webui
    import webui
  File "/content/stable-diffusion-webui/webui.py", line 78, in <module>
    shared.sd_model = modules.sd_models.load_model()
  File "/content/stable-diffusion-webui/modules/sd_models.py", line 147, in load_model
    load_model_weights(sd_model, checkpoint_info.filename, checkpoint_info.hash)
  File "/content/stable-diffusion-webui/modules/sd_models.py", line 127, in load_model_weights
    model.load_state_dict(sd, strict=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
    size mismatch for model.diffusion_model.input_blocks.0.0.weight: copying a param with shape torch.Size([320, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([320, 4, 3, 3]).

I think this has something to do with a difference between the architecture your trainer uses and the one normal fine-tuning uses? I'm not an expert.

Jack000 commented 1 year ago

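The inpaint model isn't compatible with other SD tools, unfortunately. The UNet has a slightly different architecture: as the shapes in the error show, its first convolution takes 8 input channels, while the standard SD model expects 4.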
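
For anyone hitting the same error, here is a minimal sketch of how the mismatch could be spotted before loading a merged checkpoint into a web UI. The helper `find_shape_mismatches` is hypothetical (it is not part of this repo or of PyTorch), and it works on plain name-to-shape mappings so it runs without any dependencies. With real checkpoints, such a mapping could presumably be built from a state dict with `{k: tuple(v.shape) for k, v in state_dict.items()}`. The shapes below are copied from the error message in this issue.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Hypothetical helper: list (name, checkpoint_shape, model_shape)
    for every parameter present in both mappings whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are skipped; only shape conflicts are reported.
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes taken from the error message in this issue.
# Conv2d weights are laid out as (out_channels, in_channels, kH, kW),
# so index 1 is the number of input channels.
KEY = "model.diffusion_model.input_blocks.0.0.weight"
checkpoint_shapes = {KEY: (320, 8, 3, 3)}  # merged inpaint checkpoint
model_shapes = {KEY: (320, 4, 3, 3)}       # model the web UI builds

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint has {ckpt_shape[1]} input channels, "
          f"model expects {model_shape[1]}")
```

Note that the traceback shows `load_state_dict(sd, strict=False)` still failing: `strict=False` only tolerates missing or unexpected keys, not parameters whose shapes differ, so a check like this is only a diagnostic and does not make the checkpoint loadable.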