SDXL size mismatch - Githubissues

Do I need to use any specific resolution to train for SDXL? 1024x1024? After seeing the error with juggernaut I had scaled all the images to 512x512 and it worked with dreamshaper_8, but I think it wasn't necessary, they should be scaled automatically right?

ComfyUI log with juggernaut

``` ** ComfyUI startup time: 2024-03-08 19:42:26.623486 ** Platform: Linux ** Python version: 3.11.8 (main, Mar 3 2024, 09:23:40) [GCC 13.2.0] ** Python executable: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/bin/python ** Log path: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/comfyui.log Prestartup times for custom nodes: 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager Total VRAM 12042 MB, total RAM 32019 MB xformers version: 0.0.23.post1 Set vram state to: NORMAL_VRAM Device: cuda:0 NVIDIA GeForce RTX 3060 : cudaMallocAsync VAE dtype: torch.bfloat16 Using xformers cross attention [36;20m[comfyui_controlnet_aux] | INFO -> Using ckpts path: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts[0m [36;20m[comfyui_controlnet_aux] | INFO -> Using symlinks: False[0m [36;20m[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider'][0m DWPose: Onnxruntime with acceleration providers detected ### Loading: ComfyUI-Manager (V2.9) ### ComfyUI Revision: 2052 [55f37baa] | Released on '2024-03-07' Import times for custom nodes: 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Image-Captioning-in-ComfyUI 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-WD14-Tagger 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui-tooling-nodes 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui-inpaint-nodes 0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI_UltimateSDUpscale 0.1 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager 0.7 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui_controlnet_aux Starting server To see the GUI go to: http://127.0.0.1:8188 [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json FETCH DATA from: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager/extension-node-map.json got prompt [] /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py The following values were not passed to `accelerate launch` and had defaults used instead: `--num_processes` was set to a value of `1` `--num_machines` was set to a value of `1` `--mixed_precision` was set to a value of `'no'` `--dynamo_backend` was set to a value of `'no'` To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`. prepare tokenizer update token length: 225 Using DreamBooth method. prepare images. found directory /home/alex/database/5_myimages contains 16 image files 80 train images with repeating. 0 reg images. no regularization images / 正則化画像が見つかりませんでした [Dataset 0] batch_size: 1 resolution: (512, 512) enable_bucket: True min_bucket_reso: 256 max_bucket_reso: 1584 bucket_reso_steps: 64 bucket_no_upscale: False [Subset 0 of Dataset 0] image_dir: "/home/alex/database/5_myimages" image_count: 16 num_repeats: 5 shuffle_caption: True keep_tokens: 0 caption_dropout_rate: 0.0 caption_dropout_every_n_epoches: 0 caption_tag_dropout_rate: 0.0 caption_prefix: None caption_suffix: None color_aug: False flip_aug: False face_crop_aug_range: None random_crop: False token_warmup_min: 1, token_warmup_step: 0, is_reg: False class_tokens: myimages caption_extension: .txt [Dataset 0] loading image sizes. 0%| | 0/16 [00:00 trainer.train(args) File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py", line 228, in train model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py", line 102, in load_target_model text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype, accelerator) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/train_util.py", line 3917, in load_target_model text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model( ^^^^^^^^^^^^^^^^^^^ File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/train_util.py", line 3860, in _load_target_model text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/model_util.py", line 1007, in load_models_from_stable_diffusion_checkpoint info = unet.load_state_dict(converted_unet_checkpoint) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel: Missing key(s) in state_dict: "down_blocks.0.attentions.0.norm.weight", "down_blocks.0.attentions.0.norm.bias", "down_blocks.0.attentions.0.proj_in.weight", "down_blocks.0.attentions.0.proj_in.bias", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_v.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.bias", "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.weight", "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.bias", "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.weight", "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.bias", "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.bias", "down_blocks.0.attentions.0.transformer_blocks.0.norm1.weight", "down_blocks.0.attentions.0.transformer_blocks.0.norm1.bias", "down_blocks.0.attentions.0.transformer_blocks.0.norm2.weight", "down_blocks.0.attentions.0.transformer_blocks.0.norm2.bias", "down_blocks.0.attentions.0.transformer_blocks.0.norm3.weight", "down_blocks.0.attentions.0.transformer_blocks.0.norm3.bias", "down_blocks.0.attentions.0.proj_out.weight", "down_blocks.0.attentions.0.proj_out.bias", "down_blocks.0.attentions.1.norm.weight", "down_blocks.0.attentions.1.norm.bias", "down_blocks.0.attentions.1.proj_in.weight", "down_blocks.0.attentions.1.proj_in.bias", "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_q.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_v.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.bias", "down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.weight", "down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.bias", "down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.weight", "down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.bias", "down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_q.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.weight", "down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.bias", "down_blocks.0.attentions.1.transformer_blocks.0.norm1.weight", "down_blocks.0.attentions.1.transformer_blocks.0.norm1.bias", "down_blocks.0.attentions.1.transformer_blocks.0.norm2.weight", "down_blocks.0.attentions.1.transformer_blocks.0.norm2.bias", "down_blocks.0.attentions.1.transformer_blocks.0.norm3.weight", "down_blocks.0.attentions.1.transformer_blocks.0.norm3.bias", "down_blocks.0.attentions.1.proj_out.weight", "down_blocks.0.attentions.1.proj_out.bias", "down_blocks.2.downsamplers.0.conv.weight", "down_blocks.2.downsamplers.0.conv.bias", "down_blocks.3.resnets.0.norm1.weight", "down_blocks.3.resnets.0.norm1.bias", "down_blocks.3.resnets.0.conv1.weight", "down_blocks.3.resnets.0.conv1.bias", "down_blocks.3.resnets.0.time_emb_proj.weight", "down_blocks.3.resnets.0.time_emb_proj.bias", "down_blocks.3.resnets.0.norm2.weight", "down_blocks.3.resnets.0.norm2.bias", "down_blocks.3.resnets.0.conv2.weight", "down_blocks.3.resnets.0.conv2.bias", "down_blocks.3.resnets.1.norm1.weight", "down_blocks.3.resnets.1.norm1.bias", "down_blocks.3.resnets.1.conv1.weight", "down_blocks.3.resnets.1.conv1.bias", "down_blocks.3.resnets.1.time_emb_proj.weight", "down_blocks.3.resnets.1.time_emb_proj.bias", "down_blocks.3.resnets.1.norm2.weight", "down_blocks.3.resnets.1.norm2.bias", "down_blocks.3.resnets.1.conv2.weight", "down_blocks.3.resnets.1.conv2.bias", "up_blocks.2.attentions.0.norm.weight", "up_blocks.2.attentions.0.norm.bias", "up_blocks.2.attentions.0.proj_in.weight", "up_blocks.2.attentions.0.proj_in.bias", "up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.weight", "up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.bias", "up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.2.attentions.0.transformer_blocks.0.norm1.weight", "up_blocks.2.attentions.0.transformer_blocks.0.norm1.bias", "up_blocks.2.attentions.0.transformer_blocks.0.norm2.weight", "up_blocks.2.attentions.0.transformer_blocks.0.norm2.bias", "up_blocks.2.attentions.0.transformer_blocks.0.norm3.weight", "up_blocks.2.attentions.0.transformer_blocks.0.norm3.bias", "up_blocks.2.attentions.0.proj_out.weight", "up_blocks.2.attentions.0.proj_out.bias", "up_blocks.2.attentions.1.norm.weight", "up_blocks.2.attentions.1.norm.bias", "up_blocks.2.attentions.1.proj_in.weight", "up_blocks.2.attentions.1.proj_in.bias", "up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_q.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_k.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_v.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.weight", "up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.bias", "up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_q.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.2.attentions.1.transformer_blocks.0.norm1.weight", "up_blocks.2.attentions.1.transformer_blocks.0.norm1.bias", "up_blocks.2.attentions.1.transformer_blocks.0.norm2.weight", "up_blocks.2.attentions.1.transformer_blocks.0.norm2.bias", "up_blocks.2.attentions.1.transformer_blocks.0.norm3.weight", "up_blocks.2.attentions.1.transformer_blocks.0.norm3.bias", "up_blocks.2.attentions.1.proj_out.weight", "up_blocks.2.attentions.1.proj_out.bias", "up_blocks.2.attentions.2.norm.weight", "up_blocks.2.attentions.2.norm.bias", "up_blocks.2.attentions.2.proj_in.weight", "up_blocks.2.attentions.2.proj_in.bias", "up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_q.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_k.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_v.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.weight", "up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.bias", "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_q.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.2.attentions.2.transformer_blocks.0.norm1.weight", "up_blocks.2.attentions.2.transformer_blocks.0.norm1.bias", "up_blocks.2.attentions.2.transformer_blocks.0.norm2.weight", "up_blocks.2.attentions.2.transformer_blocks.0.norm2.bias", "up_blocks.2.attentions.2.transformer_blocks.0.norm3.weight", "up_blocks.2.attentions.2.transformer_blocks.0.norm3.bias", "up_blocks.2.attentions.2.proj_out.weight", "up_blocks.2.attentions.2.proj_out.bias", "up_blocks.2.upsamplers.0.conv.weight", "up_blocks.2.upsamplers.0.conv.bias", "up_blocks.3.attentions.0.norm.weight", "up_blocks.3.attentions.0.norm.bias", "up_blocks.3.attentions.0.proj_in.weight", "up_blocks.3.attentions.0.proj_in.bias", "up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_q.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_k.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_v.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.weight", "up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.bias", "up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_q.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.3.attentions.0.transformer_blocks.0.norm1.weight", "up_blocks.3.attentions.0.transformer_blocks.0.norm1.bias", "up_blocks.3.attentions.0.transformer_blocks.0.norm2.weight", "up_blocks.3.attentions.0.transformer_blocks.0.norm2.bias", "up_blocks.3.attentions.0.transformer_blocks.0.norm3.weight", "up_blocks.3.attentions.0.transformer_blocks.0.norm3.bias", "up_blocks.3.attentions.0.proj_out.weight", "up_blocks.3.attentions.0.proj_out.bias", "up_blocks.3.attentions.1.norm.weight", "up_blocks.3.attentions.1.norm.bias", "up_blocks.3.attentions.1.proj_in.weight", "up_blocks.3.attentions.1.proj_in.bias", "up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_q.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_k.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_v.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.weight", "up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.bias", "up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_q.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.3.attentions.1.transformer_blocks.0.norm1.weight", "up_blocks.3.attentions.1.transformer_blocks.0.norm1.bias", "up_blocks.3.attentions.1.transformer_blocks.0.norm2.weight", "up_blocks.3.attentions.1.transformer_blocks.0.norm2.bias", "up_blocks.3.attentions.1.transformer_blocks.0.norm3.weight", "up_blocks.3.attentions.1.transformer_blocks.0.norm3.bias", "up_blocks.3.attentions.1.proj_out.weight", "up_blocks.3.attentions.1.proj_out.bias", "up_blocks.3.attentions.2.norm.weight", "up_blocks.3.attentions.2.norm.bias", "up_blocks.3.attentions.2.proj_in.weight", "up_blocks.3.attentions.2.proj_in.bias", "up_blocks.3.attentions.2.transformer_blocks.0.attn1.to_q.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn1.to_k.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn1.to_v.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.3.attentions.2.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.3.attentions.2.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.3.attentions.2.transformer_blocks.0.ff.net.2.weight", "up_blocks.3.attentions.2.transformer_blocks.0.ff.net.2.bias", "up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_q.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_k.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_v.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.3.attentions.2.transformer_blocks.0.norm1.weight", "up_blocks.3.attentions.2.transformer_blocks.0.norm1.bias", "up_blocks.3.attentions.2.transformer_blocks.0.norm2.weight", "up_blocks.3.attentions.2.transformer_blocks.0.norm2.bias", "up_blocks.3.attentions.2.transformer_blocks.0.norm3.weight", "up_blocks.3.attentions.2.transformer_blocks.0.norm3.bias", "up_blocks.3.attentions.2.proj_out.weight", "up_blocks.3.attentions.2.proj_out.bias", "up_blocks.3.resnets.0.norm1.weight", "up_blocks.3.resnets.0.norm1.bias", "up_blocks.3.resnets.0.conv1.weight", "up_blocks.3.resnets.0.conv1.bias", "up_blocks.3.resnets.0.time_emb_proj.weight", "up_blocks.3.resnets.0.time_emb_proj.bias", "up_blocks.3.resnets.0.norm2.weight", "up_blocks.3.resnets.0.norm2.bias", "up_blocks.3.resnets.0.conv2.weight", "up_blocks.3.resnets.0.conv2.bias", "up_blocks.3.resnets.0.conv_shortcut.weight", "up_blocks.3.resnets.0.conv_shortcut.bias", "up_blocks.3.resnets.1.norm1.weight", "up_blocks.3.resnets.1.norm1.bias", "up_blocks.3.resnets.1.conv1.weight", "up_blocks.3.resnets.1.conv1.bias", "up_blocks.3.resnets.1.time_emb_proj.weight", "up_blocks.3.resnets.1.time_emb_proj.bias", "up_blocks.3.resnets.1.norm2.weight", "up_blocks.3.resnets.1.norm2.bias", "up_blocks.3.resnets.1.conv2.weight", "up_blocks.3.resnets.1.conv2.bias", "up_blocks.3.resnets.1.conv_shortcut.weight", "up_blocks.3.resnets.1.conv_shortcut.bias", "up_blocks.3.resnets.2.norm1.weight", "up_blocks.3.resnets.2.norm1.bias", "up_blocks.3.resnets.2.conv1.weight", "up_blocks.3.resnets.2.conv1.bias", "up_blocks.3.resnets.2.time_emb_proj.weight", "up_blocks.3.resnets.2.time_emb_proj.bias", "up_blocks.3.resnets.2.norm2.weight", "up_blocks.3.resnets.2.norm2.bias", "up_blocks.3.resnets.2.conv2.weight", "up_blocks.3.resnets.2.conv2.bias", "up_blocks.3.resnets.2.conv_shortcut.weight", "up_blocks.3.resnets.2.conv_shortcut.bias". Unexpected key(s) in state_dict: "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.weight", "down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.bias", "down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.weight", "down_blocks.1.attentions.0.transformer_blocks.1.norm1.bias", "down_blocks.1.attentions.0.transformer_blocks.1.norm1.weight", "down_blocks.1.attentions.0.transformer_blocks.1.norm2.bias", "down_blocks.1.attentions.0.transformer_blocks.1.norm2.weight", "down_blocks.1.attentions.0.transformer_blocks.1.norm3.bias", "down_blocks.1.attentions.0.transformer_blocks.1.norm3.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.weight", "down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.weight", "down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.bias", "down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.weight", "down_blocks.1.attentions.1.transformer_blocks.1.norm1.bias", "down_blocks.1.attentions.1.transformer_blocks.1.norm1.weight", "down_blocks.1.attentions.1.transformer_blocks.1.norm2.bias", "down_blocks.1.attentions.1.transformer_blocks.1.norm2.weight", "down_blocks.1.attentions.1.transformer_blocks.1.norm3.bias", "down_blocks.1.attentions.1.transformer_blocks.1.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.1.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.1.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.1.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.1.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.1.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.1.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.2.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.2.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.2.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.2.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.2.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.2.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.3.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.3.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.3.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.3.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.3.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.3.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.4.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.4.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.4.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.4.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.4.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.4.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.5.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.5.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.5.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.5.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.5.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.5.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.6.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.6.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.6.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.6.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.6.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.6.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.7.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.7.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.7.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.7.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.7.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.7.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.8.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.8.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.8.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.8.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.8.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.8.norm3.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_k.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_out.0.bias", "down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_out.0.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_q.weight", "down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_v.weight", "down_blocks.2.attentions.0.transformer_blocks.9.ff.net.0.proj.bias", "down_blocks.2.attentions.0.transformer_blocks.9.ff.net.0.proj.weight", "down_blocks.2.attentions.0.transformer_blocks.9.ff.net.2.bias", "down_blocks.2.attentions.0.transformer_blocks.9.ff.net.2.weight", "down_blocks.2.attentions.0.transformer_blocks.9.norm1.bias", "down_blocks.2.attentions.0.transformer_blocks.9.norm1.weight", "down_blocks.2.attentions.0.transformer_blocks.9.norm2.bias", "down_blocks.2.attentions.0.transformer_blocks.9.norm2.weight", "down_blocks.2.attentions.0.transformer_blocks.9.norm3.bias", "down_blocks.2.attentions.0.transformer_blocks.9.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.1.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.1.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.1.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.1.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.1.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.1.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.1.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.1.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.1.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.2.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.2.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.2.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.2.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.2.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.2.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.2.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.2.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.2.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.2.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.2.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.2.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.2.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.3.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.3.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.3.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.3.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.3.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.3.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.3.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.3.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.3.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.3.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.3.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.3.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.3.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.4.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.4.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.4.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.4.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.4.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.4.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.4.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.4.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.4.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.4.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.4.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.4.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.4.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.5.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.5.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.5.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.5.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.5.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.5.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.5.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.5.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.5.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.5.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.5.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.5.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.5.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.6.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.6.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.6.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.6.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.6.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.6.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.6.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.6.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.6.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.6.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.6.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.6.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.6.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.7.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.7.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.7.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.7.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.7.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.7.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.7.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.7.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.7.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.7.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.7.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.7.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.7.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.8.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.8.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.8.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.8.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.8.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.8.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.8.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.8.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.8.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.8.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.8.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.8.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.8.norm3.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn1.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn1.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.9.attn1.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn1.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn1.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn2.to_k.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn2.to_out.0.bias", "down_blocks.2.attentions.1.transformer_blocks.9.attn2.to_out.0.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn2.to_q.weight", "down_blocks.2.attentions.1.transformer_blocks.9.attn2.to_v.weight", "down_blocks.2.attentions.1.transformer_blocks.9.ff.net.0.proj.bias", "down_blocks.2.attentions.1.transformer_blocks.9.ff.net.0.proj.weight", "down_blocks.2.attentions.1.transformer_blocks.9.ff.net.2.bias", "down_blocks.2.attentions.1.transformer_blocks.9.ff.net.2.weight", "down_blocks.2.attentions.1.transformer_blocks.9.norm1.bias", "down_blocks.2.attentions.1.transformer_blocks.9.norm1.weight", "down_blocks.2.attentions.1.transformer_blocks.9.norm2.bias", "down_blocks.2.attentions.1.transformer_blocks.9.norm2.weight", "down_blocks.2.attentions.1.transformer_blocks.9.norm3.bias", "down_blocks.2.attentions.1.transformer_blocks.9.norm3.weight", "up_blocks.0.attentions.0.norm.bias", "up_blocks.0.attentions.0.norm.weight", "up_blocks.0.attentions.0.proj_in.bias", "up_blocks.0.attentions.0.proj_in.weight", "up_blocks.0.attentions.0.proj_out.bias", "up_blocks.0.attentions.0.proj_out.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.0.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.0.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.0.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.0.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.0.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.0.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.1.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.1.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.1.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.1.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.1.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.1.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.1.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.1.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.1.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.2.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.2.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.2.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.2.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.2.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.2.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.2.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.2.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.2.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.2.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.2.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.2.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.3.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.3.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.3.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.3.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.3.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.3.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.3.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.3.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.3.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.3.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.3.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.3.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.3.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.4.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.4.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.4.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.4.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.4.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.4.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.4.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.4.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.4.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.4.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.4.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.4.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.4.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.5.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.5.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.5.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.5.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.5.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.5.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.5.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.5.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.5.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.5.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.5.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.5.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.5.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.6.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.6.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.6.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.6.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.6.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.6.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.6.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.6.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.6.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.6.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.6.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.6.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.6.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.7.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.7.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.7.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.7.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.7.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.7.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.7.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.7.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.7.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.7.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.7.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.7.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.7.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.8.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.8.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.8.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.8.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.8.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.8.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.8.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.8.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.8.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.8.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.8.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.8.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.8.norm3.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn1.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn1.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.9.attn1.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn1.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn1.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn2.to_k.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn2.to_out.0.bias", "up_blocks.0.attentions.0.transformer_blocks.9.attn2.to_out.0.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn2.to_q.weight", "up_blocks.0.attentions.0.transformer_blocks.9.attn2.to_v.weight", "up_blocks.0.attentions.0.transformer_blocks.9.ff.net.0.proj.bias", "up_blocks.0.attentions.0.transformer_blocks.9.ff.net.0.proj.weight", "up_blocks.0.attentions.0.transformer_blocks.9.ff.net.2.bias", "up_blocks.0.attentions.0.transformer_blocks.9.ff.net.2.weight", "up_blocks.0.attentions.0.transformer_blocks.9.norm1.bias", "up_blocks.0.attentions.0.transformer_blocks.9.norm1.weight", "up_blocks.0.attentions.0.transformer_blocks.9.norm2.bias", "up_blocks.0.attentions.0.transformer_blocks.9.norm2.weight", "up_blocks.0.attentions.0.transformer_blocks.9.norm3.bias", "up_blocks.0.attentions.0.transformer_blocks.9.norm3.weight", "up_blocks.0.attentions.1.norm.bias", "up_blocks.0.attentions.1.norm.weight", "up_blocks.0.attentions.1.proj_in.bias", "up_blocks.0.attentions.1.proj_in.weight", "up_blocks.0.attentions.1.proj_out.bias", "up_blocks.0.attentions.1.proj_out.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.0.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.0.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.0.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.0.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.0.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.0.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm3.weight", "up_blocks.0.attentions.2.norm.bias", "up_blocks.0.attentions.2.norm.weight", "up_blocks.0.attentions.2.proj_in.bias", "up_blocks.0.attentions.2.proj_in.weight", "up_blocks.0.attentions.2.proj_out.bias", "up_blocks.0.attentions.2.proj_out.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.6.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.6.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.6.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm3.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm3.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm3.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_v.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm3.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.1.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.1.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.1.norm1.bias", "mid_block.attentions.0.transformer_blocks.1.norm1.weight", "mid_block.attentions.0.transformer_blocks.1.norm2.bias", "mid_block.attentions.0.transformer_blocks.1.norm2.weight", "mid_block.attentions.0.transformer_blocks.1.norm3.bias", "mid_block.attentions.0.transformer_blocks.1.norm3.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.2.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.2.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.2.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.2.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.2.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.2.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.2.norm1.bias", "mid_block.attentions.0.transformer_blocks.2.norm1.weight", "mid_block.attentions.0.transformer_blocks.2.norm2.bias", "mid_block.attentions.0.transformer_blocks.2.norm2.weight", "mid_block.attentions.0.transformer_blocks.2.norm3.bias", "mid_block.attentions.0.transformer_blocks.2.norm3.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.3.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.3.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.3.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.3.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.3.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.3.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.3.norm1.bias", "mid_block.attentions.0.transformer_blocks.3.norm1.weight", "mid_block.attentions.0.transformer_blocks.3.norm2.bias", "mid_block.attentions.0.transformer_blocks.3.norm2.weight", "mid_block.attentions.0.transformer_blocks.3.norm3.bias", "mid_block.attentions.0.transformer_blocks.3.norm3.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.4.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.4.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.4.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.4.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.4.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.4.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.4.norm1.bias", "mid_block.attentions.0.transformer_blocks.4.norm1.weight", "mid_block.attentions.0.transformer_blocks.4.norm2.bias", "mid_block.attentions.0.transformer_blocks.4.norm2.weight", "mid_block.attentions.0.transformer_blocks.4.norm3.bias", "mid_block.attentions.0.transformer_blocks.4.norm3.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.5.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.5.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.5.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.5.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.5.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.5.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.5.norm1.bias", "mid_block.attentions.0.transformer_blocks.5.norm1.weight", "mid_block.attentions.0.transformer_blocks.5.norm2.bias", "mid_block.attentions.0.transformer_blocks.5.norm2.weight", "mid_block.attentions.0.transformer_blocks.5.norm3.bias", "mid_block.attentions.0.transformer_blocks.5.norm3.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.6.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.6.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.6.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.6.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.6.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.6.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.6.norm1.bias", "mid_block.attentions.0.transformer_blocks.6.norm1.weight", "mid_block.attentions.0.transformer_blocks.6.norm2.bias", "mid_block.attentions.0.transformer_blocks.6.norm2.weight", "mid_block.attentions.0.transformer_blocks.6.norm3.bias", "mid_block.attentions.0.transformer_blocks.6.norm3.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.7.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.7.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.7.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.7.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.7.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.7.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.7.norm1.bias", "mid_block.attentions.0.transformer_blocks.7.norm1.weight", "mid_block.attentions.0.transformer_blocks.7.norm2.bias", "mid_block.attentions.0.transformer_blocks.7.norm2.weight", "mid_block.attentions.0.transformer_blocks.7.norm3.bias", "mid_block.attentions.0.transformer_blocks.7.norm3.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.8.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.8.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.8.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.8.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.8.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.8.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.8.norm1.bias", "mid_block.attentions.0.transformer_blocks.8.norm1.weight", "mid_block.attentions.0.transformer_blocks.8.norm2.bias", "mid_block.attentions.0.transformer_blocks.8.norm2.weight", "mid_block.attentions.0.transformer_blocks.8.norm3.bias", "mid_block.attentions.0.transformer_blocks.8.norm3.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.9.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.9.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.9.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.9.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.9.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.9.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.9.norm1.bias", "mid_block.attentions.0.transformer_blocks.9.norm1.weight", "mid_block.attentions.0.transformer_blocks.9.norm2.bias", "mid_block.attentions.0.transformer_blocks.9.norm2.weight", "mid_block.attentions.0.transformer_blocks.9.norm3.bias", "mid_block.attentions.0.transformer_blocks.9.norm3.weight". size mismatch for down_blocks.1.attentions.0.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.0.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.2.attentions.0.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.0.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.0.resnets.2.norm1.weight: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.0.resnets.2.norm1.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.0.resnets.2.conv1.weight: copying a param with shape torch.Size([1280, 1920, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]). size mismatch for up_blocks.0.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([1280, 1920, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]). size mismatch for up_blocks.1.attentions.0.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.0.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.0.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.1.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.1.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.2.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.2.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.2.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.norm1.weight: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.1.resnets.0.norm1.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.1.resnets.0.conv1.weight: copying a param with shape torch.Size([640, 1920, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]). size mismatch for up_blocks.1.resnets.0.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.resnets.0.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]). size mismatch for up_blocks.1.resnets.0.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.0.conv_shortcut.weight: copying a param with shape torch.Size([640, 1920, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]). size mismatch for up_blocks.1.resnets.0.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.norm1.weight: copying a param with shape torch.Size([1280]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.1.resnets.1.norm1.bias: copying a param with shape torch.Size([1280]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.1.resnets.1.conv1.weight: copying a param with shape torch.Size([640, 1280, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]). size mismatch for up_blocks.1.resnets.1.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.resnets.1.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]). size mismatch for up_blocks.1.resnets.1.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.1.conv_shortcut.weight: copying a param with shape torch.Size([640, 1280, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]). size mismatch for up_blocks.1.resnets.1.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.norm1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]). size mismatch for up_blocks.1.resnets.2.norm1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]). size mismatch for up_blocks.1.resnets.2.conv1.weight: copying a param with shape torch.Size([640, 960, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1920, 3, 3]). size mismatch for up_blocks.1.resnets.2.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.resnets.2.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]). size mismatch for up_blocks.1.resnets.2.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([640, 960, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 1920, 1, 1]). size mismatch for up_blocks.1.resnets.2.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.upsamplers.0.conv.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]). size mismatch for up_blocks.1.upsamplers.0.conv.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.2.resnets.0.norm1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]). size mismatch for up_blocks.2.resnets.0.norm1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]). size mismatch for up_blocks.2.resnets.0.conv1.weight: copying a param with shape torch.Size([320, 960, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 1920, 3, 3]). size mismatch for up_blocks.2.resnets.0.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.0.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]). size mismatch for up_blocks.2.resnets.0.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.0.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.0.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.0.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]). size mismatch for up_blocks.2.resnets.0.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.0.conv_shortcut.weight: copying a param with shape torch.Size([320, 960, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 1920, 1, 1]). size mismatch for up_blocks.2.resnets.0.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.2.resnets.1.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.2.resnets.1.conv1.weight: copying a param with shape torch.Size([320, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 1280, 3, 3]). size mismatch for up_blocks.2.resnets.1.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]). size mismatch for up_blocks.2.resnets.1.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]). size mismatch for up_blocks.2.resnets.1.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.1.conv_shortcut.weight: copying a param with shape torch.Size([320, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 1280, 1, 1]). size mismatch for up_blocks.2.resnets.1.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([960]). size mismatch for up_blocks.2.resnets.2.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([960]). size mismatch for up_blocks.2.resnets.2.conv1.weight: copying a param with shape torch.Size([320, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 960, 3, 3]). size mismatch for up_blocks.2.resnets.2.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]). size mismatch for up_blocks.2.resnets.2.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]). size mismatch for up_blocks.2.resnets.2.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for up_blocks.2.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([320, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 960, 1, 1]). size mismatch for up_blocks.2.resnets.2.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for mid_block.attentions.0.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for mid_block.attentions.0.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). Traceback (most recent call last): File "", line 198, in _run_module_as_main File "", line 88, in _run_code File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 996, in main() File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 992, in main launch_command(args) File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 986, in launch_command simple_launcher(args) File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/bin/python', '/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors', '--train_data_dir=/home/alex/database', '--output_dir=models/loras', '--logging_dir=./logs', '--log_prefix=myimages', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=40', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=myimages', '--train_batch_size=1', '--save_every_n_epochs=10', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=2', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 1. Train finished Prompt executed in 9.06 seconds ```

LarryJane491 / Lora-Training-in-Comfy

SDXL size mismatch #20