LarryJane491 / Lora-Training-in-Comfy

This custom node lets you train LoRA directly in ComfyUI!

SDXL size mismatch #20

Open · JunhaoWang opened this issue 7 months ago

JunhaoWang commented 7 months ago

Just verified: this doesn't work with SDXL due to a size mismatch.
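For anyone hitting this, the mismatch is detectable before training even starts, because SDXL checkpoints carry a second text encoder that SD1.x checkpoints don't. A rough heuristic sketch (not part of this repo; the key prefixes assume the usual single-file LDM checkpoint layout, and the filename is a placeholder):

```python
# Heuristic: distinguish an SDXL checkpoint from an SD1.x one by its
# tensor names, without loading any weights into memory.
from safetensors import safe_open

def checkpoint_family(path: str) -> str:
    with safe_open(path, framework="pt", device="cpu") as f:
        keys = list(f.keys())
    # SDXL single-file checkpoints ship a second (OpenCLIP) text encoder.
    if any(k.startswith("conditioner.embedders.1.") for k in keys):
        return "sdxl"
    # SD1.x/2.x checkpoints keep their lone text encoder here instead.
    if any(k.startswith("cond_stage_model.") for k in keys):
        return "sd1.x/2.x"
    return "unknown"

# Hypothetical filename, for illustration only:
# print(checkpoint_family("juggernaut_xl.safetensors"))
```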

xenogenesi commented 6 months ago

It took me some time, but I managed to install everything on Linux/Debian. I was able to create a LoRA for dreamshaper (nice work, very simple process), but not for juggernaut (SDXL); from the messages it looked like the same cause as this issue. Unfortunately I didn't think to save the logs. I'll run it again tonight, save the logs, and attach them here.

xenogenesi commented 6 months ago

Do I need to use a specific resolution to train for SDXL, such as 1024x1024? After seeing the error with juggernaut I scaled all the images down to 512x512, and that worked with dreamshaper_8, but I don't think it was necessary; they should be scaled automatically, right?
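(On the resolution question: the `enable_bucket: True` and `bucket_reso_steps: 64` lines in the log below suggest the trainer sorts images into aspect-ratio buckets and rescales them itself, so pre-scaling to 512x512 shouldn't be required. A minimal sketch of the bucketing idea, as an approximation rather than the actual sd-scripts code:)

```python
# Minimal sketch of kohya-style aspect-ratio bucketing (an approximation,
# not the exact sd-scripts implementation). Images are grouped into
# resolution "buckets" whose area stays near the training resolution,
# so nothing has to be pre-scaled to a single fixed size.

def make_buckets(base_reso=512, min_reso=256, max_reso=1584, step=64):
    """Enumerate (width, height) pairs with area <= base_reso**2."""
    max_area = base_reso * base_reso
    buckets = set()
    w = min_reso
    while w <= max_reso:
        h = min(max_reso, (max_area // w) // step * step)
        if h >= min_reso:
            buckets.add((w, h))
            buckets.add((h, w))
        w += step
    return sorted(buckets)

def assign_bucket(img_w, img_h, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ar = img_w / img_h
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ar))

if __name__ == "__main__":
    buckets = make_buckets()
    print(assign_bucket(1920, 1080, buckets))  # wide image -> wide bucket
```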

ComfyUI log with juggernaut:

```
** ComfyUI startup time: 2024-03-08 19:42:26.623486
** Platform: Linux
** Python version: 3.11.8 (main, Mar 3 2024, 09:23:40) [GCC 13.2.0]
** Python executable: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/bin/python
** Log path: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/comfyui.log

Prestartup times for custom nodes:
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager

Total VRAM 12042 MB, total RAM 32019 MB
xformers version: 0.0.23.post1
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3060 : cudaMallocAsync
VAE dtype: torch.bfloat16
Using xformers cross attention
[comfyui_controlnet_aux] | INFO -> Using ckpts path: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts
[comfyui_controlnet_aux] | INFO -> Using symlinks: False
[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
DWPose: Onnxruntime with acceleration providers detected
### Loading: ComfyUI-Manager (V2.9)
### ComfyUI Revision: 2052 [55f37baa] | Released on '2024-03-07'

Import times for custom nodes:
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Image-Captioning-in-ComfyUI
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-WD14-Tagger
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui-tooling-nodes
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui-inpaint-nodes
   0.0 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI_UltimateSDUpscale
   0.1 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager
   0.7 seconds: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/comfyui_controlnet_aux

Starting server

To see the GUI go to: http://127.0.0.1:8188
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
FETCH DATA from: /home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/ComfyUI-Manager/extension-node-map.json
got prompt
[]
/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
prepare tokenizer
update token length: 225
Using DreamBooth method.
prepare images.
found directory /home/alex/database/5_myimages contains 16 image files
80 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
  batch_size: 1
  resolution: (512, 512)
  enable_bucket: True
  min_bucket_reso: 256
  max_bucket_reso: 1584
  bucket_reso_steps: 64
  bucket_no_upscale: False

  [Subset 0 of Dataset 0]
    image_dir: "/home/alex/database/5_myimages"
    image_count: 16
    num_repeats: 5
    shuffle_caption: True
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    caption_prefix: None
    caption_suffix: None
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: myimages
    caption_extension: .txt

[Dataset 0]
loading image sizes.
  0%|          | 0/16 [00:00<?, ?it/s]
[... start of the traceback lost in the original paste ...]
    trainer.train(args)
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py", line 228, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py", line 102, in load_target_model
    text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype, accelerator)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/train_util.py", line 3917, in load_target_model
    text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model(
                                                            ^^^^^^^^^^^^^^^^^^^
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/train_util.py", line 3860, in _load_target_model
    text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/library/model_util.py", line 1007, in load_models_from_stable_diffusion_checkpoint
    info = unet.load_state_dict(converted_unet_checkpoint)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
        Missing key(s) in state_dict: "down_blocks.0.attentions.0.norm.weight", "down_blocks.0.attentions.0.norm.bias", "down_blocks.0.attentions.0.proj_in.weight", ...
        [hundreds of SD1.x-only UNet keys omitted here for readability: all of down_blocks.0.attentions.*, down_blocks.2.downsamplers.*, down_blocks.3.resnets.*, up_blocks.2.attentions.*, up_blocks.2.upsamplers.*, up_blocks.3.attentions.* and up_blocks.3.resnets.*]
        Unexpected key(s) in state_dict: "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.weight", "down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", ...
        [hundreds of SDXL-only UNet keys omitted: extra transformer_blocks.1 through transformer_blocks.9 under down_blocks.1, down_blocks.2 and up_blocks.0; the paste itself is cut off mid-list]
```
"up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.1.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.1.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.1.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.1.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.2.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.2.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.2.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.3.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.3.ff.net.2.bias", 
"up_blocks.0.attentions.1.transformer_blocks.3.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.3.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.3.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.4.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.4.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.4.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.4.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.5.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.5.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.5.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.5.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_out.0.bias", 
"up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.6.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.6.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.6.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.6.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.7.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.7.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.7.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.7.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.8.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.8.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.8.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.8.norm3.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_k.weight", 
"up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn1.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_k.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_out.0.bias", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_out.0.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_q.weight", "up_blocks.0.attentions.1.transformer_blocks.9.attn2.to_v.weight", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.0.proj.bias", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.0.proj.weight", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.2.bias", "up_blocks.0.attentions.1.transformer_blocks.9.ff.net.2.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm1.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm1.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm2.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm2.weight", "up_blocks.0.attentions.1.transformer_blocks.9.norm3.bias", "up_blocks.0.attentions.1.transformer_blocks.9.norm3.weight", "up_blocks.0.attentions.2.norm.bias", "up_blocks.0.attentions.2.norm.weight", "up_blocks.0.attentions.2.proj_in.bias", "up_blocks.0.attentions.2.proj_in.weight", "up_blocks.0.attentions.2.proj_out.bias", "up_blocks.0.attentions.2.proj_out.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.0.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.0.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.0.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.0.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.1.attn2.to_v.weight", 
"up_blocks.0.attentions.2.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.1.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.1.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.1.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.2.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.2.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.2.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.2.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.3.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.3.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.3.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_q.weight", 
"up_blocks.0.attentions.2.transformer_blocks.4.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.4.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.4.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.4.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.4.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.5.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.5.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.5.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.5.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.6.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.6.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.6.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.6.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm2.weight", 
"up_blocks.0.attentions.2.transformer_blocks.6.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.6.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.7.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.7.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.7.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.7.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.8.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.0.proj.bias", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.8.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.8.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.8.norm3.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn1.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_k.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_out.0.bias", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_out.0.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_q.weight", "up_blocks.0.attentions.2.transformer_blocks.9.attn2.to_v.weight", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.0.proj.bias", 
"up_blocks.0.attentions.2.transformer_blocks.9.ff.net.0.proj.weight", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.2.bias", "up_blocks.0.attentions.2.transformer_blocks.9.ff.net.2.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm1.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm1.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm2.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm2.weight", "up_blocks.0.attentions.2.transformer_blocks.9.norm3.bias", "up_blocks.0.attentions.2.transformer_blocks.9.norm3.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.0.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.0.transformer_blocks.1.norm3.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.1.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.1.transformer_blocks.1.norm3.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_k.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_out.0.bias", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_out.0.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_q.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn1.to_v.weight", 
"up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_k.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_out.0.bias", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_out.0.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_q.weight", "up_blocks.1.attentions.2.transformer_blocks.1.attn2.to_v.weight", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.0.proj.bias", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.0.proj.weight", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.2.bias", "up_blocks.1.attentions.2.transformer_blocks.1.ff.net.2.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm1.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm1.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm2.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm2.weight", "up_blocks.1.attentions.2.transformer_blocks.1.norm3.bias", "up_blocks.1.attentions.2.transformer_blocks.1.norm3.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.1.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.1.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.1.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.1.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.1.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.1.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.1.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.1.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.1.norm1.bias", "mid_block.attentions.0.transformer_blocks.1.norm1.weight", "mid_block.attentions.0.transformer_blocks.1.norm2.bias", "mid_block.attentions.0.transformer_blocks.1.norm2.weight", "mid_block.attentions.0.transformer_blocks.1.norm3.bias", "mid_block.attentions.0.transformer_blocks.1.norm3.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.2.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.2.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.2.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.2.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.2.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.2.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.2.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.2.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.2.norm1.bias", "mid_block.attentions.0.transformer_blocks.2.norm1.weight", "mid_block.attentions.0.transformer_blocks.2.norm2.bias", "mid_block.attentions.0.transformer_blocks.2.norm2.weight", "mid_block.attentions.0.transformer_blocks.2.norm3.bias", "mid_block.attentions.0.transformer_blocks.2.norm3.weight", 
"mid_block.attentions.0.transformer_blocks.3.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.3.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.3.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.3.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.3.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.3.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.3.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.3.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.3.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.3.norm1.bias", "mid_block.attentions.0.transformer_blocks.3.norm1.weight", "mid_block.attentions.0.transformer_blocks.3.norm2.bias", "mid_block.attentions.0.transformer_blocks.3.norm2.weight", "mid_block.attentions.0.transformer_blocks.3.norm3.bias", "mid_block.attentions.0.transformer_blocks.3.norm3.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.4.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.4.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.4.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.4.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.4.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.4.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.4.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.4.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.4.norm1.bias", "mid_block.attentions.0.transformer_blocks.4.norm1.weight", "mid_block.attentions.0.transformer_blocks.4.norm2.bias", "mid_block.attentions.0.transformer_blocks.4.norm2.weight", "mid_block.attentions.0.transformer_blocks.4.norm3.bias", "mid_block.attentions.0.transformer_blocks.4.norm3.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.5.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.5.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.5.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.5.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.5.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.5.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.5.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.5.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.5.norm1.bias", 
"mid_block.attentions.0.transformer_blocks.5.norm1.weight", "mid_block.attentions.0.transformer_blocks.5.norm2.bias", "mid_block.attentions.0.transformer_blocks.5.norm2.weight", "mid_block.attentions.0.transformer_blocks.5.norm3.bias", "mid_block.attentions.0.transformer_blocks.5.norm3.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.6.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.6.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.6.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.6.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.6.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.6.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.6.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.6.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.6.norm1.bias", "mid_block.attentions.0.transformer_blocks.6.norm1.weight", "mid_block.attentions.0.transformer_blocks.6.norm2.bias", "mid_block.attentions.0.transformer_blocks.6.norm2.weight", "mid_block.attentions.0.transformer_blocks.6.norm3.bias", "mid_block.attentions.0.transformer_blocks.6.norm3.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.7.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.7.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.7.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.7.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.7.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.7.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.7.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.7.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.7.norm1.bias", "mid_block.attentions.0.transformer_blocks.7.norm1.weight", "mid_block.attentions.0.transformer_blocks.7.norm2.bias", "mid_block.attentions.0.transformer_blocks.7.norm2.weight", "mid_block.attentions.0.transformer_blocks.7.norm3.bias", "mid_block.attentions.0.transformer_blocks.7.norm3.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.8.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.8.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.8.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.8.attn2.to_v.weight", 
"mid_block.attentions.0.transformer_blocks.8.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.8.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.8.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.8.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.8.norm1.bias", "mid_block.attentions.0.transformer_blocks.8.norm1.weight", "mid_block.attentions.0.transformer_blocks.8.norm2.bias", "mid_block.attentions.0.transformer_blocks.8.norm2.weight", "mid_block.attentions.0.transformer_blocks.8.norm3.bias", "mid_block.attentions.0.transformer_blocks.8.norm3.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_k.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.9.attn1.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_q.weight", "mid_block.attentions.0.transformer_blocks.9.attn1.to_v.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_k.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_out.0.bias", "mid_block.attentions.0.transformer_blocks.9.attn2.to_out.0.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_q.weight", "mid_block.attentions.0.transformer_blocks.9.attn2.to_v.weight", "mid_block.attentions.0.transformer_blocks.9.ff.net.0.proj.bias", "mid_block.attentions.0.transformer_blocks.9.ff.net.0.proj.weight", "mid_block.attentions.0.transformer_blocks.9.ff.net.2.bias", "mid_block.attentions.0.transformer_blocks.9.ff.net.2.weight", "mid_block.attentions.0.transformer_blocks.9.norm1.bias", "mid_block.attentions.0.transformer_blocks.9.norm1.weight", "mid_block.attentions.0.transformer_blocks.9.norm2.bias", "mid_block.attentions.0.transformer_blocks.9.norm2.weight", "mid_block.attentions.0.transformer_blocks.9.norm3.bias", "mid_block.attentions.0.transformer_blocks.9.norm3.weight". size mismatch for down_blocks.1.attentions.0.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.0.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([640, 768]). size mismatch for down_blocks.1.attentions.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]). 
size mismatch for down_blocks.2.attentions.0.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.0.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]). size mismatch for down_blocks.2.attentions.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.0.resnets.2.norm1.weight: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.0.resnets.2.norm1.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]). size mismatch for up_blocks.0.resnets.2.conv1.weight: copying a param with shape torch.Size([1280, 1920, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]). size mismatch for up_blocks.0.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([1280, 1920, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]). size mismatch for up_blocks.1.attentions.0.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]). size mismatch for up_blocks.1.attentions.0.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]). 
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.0.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for up_blocks.1.attentions.0.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for up_blocks.1.attentions.1.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for up_blocks.1.attentions.1.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.norm.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.norm.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for up_blocks.1.attentions.2.proj_in.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_k.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_v.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.weight: copying a param with shape torch.Size([5120, 640]) from checkpoint, the shape in current model is torch.Size([10240, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.bias: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([10240]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.weight: copying a param with shape torch.Size([640, 2560]) from checkpoint, the shape in current model is torch.Size([1280, 5120]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_q.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm3.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.norm3.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.attentions.2.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for up_blocks.1.attentions.2.proj_out.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.norm1.weight: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]).
size mismatch for up_blocks.1.resnets.0.norm1.bias: copying a param with shape torch.Size([1920]) from checkpoint, the shape in current model is torch.Size([2560]).
size mismatch for up_blocks.1.resnets.0.conv1.weight: copying a param with shape torch.Size([640, 1920, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]).
size mismatch for up_blocks.1.resnets.0.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.resnets.0.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]).
size mismatch for up_blocks.1.resnets.0.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.0.conv_shortcut.weight: copying a param with shape torch.Size([640, 1920, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]).
size mismatch for up_blocks.1.resnets.0.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.norm1.weight: copying a param with shape torch.Size([1280]) from checkpoint, the shape in current model is torch.Size([2560]).
size mismatch for up_blocks.1.resnets.1.norm1.bias: copying a param with shape torch.Size([1280]) from checkpoint, the shape in current model is torch.Size([2560]).
size mismatch for up_blocks.1.resnets.1.conv1.weight: copying a param with shape torch.Size([640, 1280, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 3, 3]).
size mismatch for up_blocks.1.resnets.1.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.resnets.1.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]).
size mismatch for up_blocks.1.resnets.1.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.1.conv_shortcut.weight: copying a param with shape torch.Size([640, 1280, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 2560, 1, 1]).
size mismatch for up_blocks.1.resnets.1.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.norm1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]).
size mismatch for up_blocks.1.resnets.2.norm1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]).
size mismatch for up_blocks.1.resnets.2.conv1.weight: copying a param with shape torch.Size([640, 960, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1920, 3, 3]).
size mismatch for up_blocks.1.resnets.2.conv1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.time_emb_proj.weight: copying a param with shape torch.Size([640, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for up_blocks.1.resnets.2.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.norm2.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.norm2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.conv2.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]).
size mismatch for up_blocks.1.resnets.2.conv2.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([640, 960, 1, 1]) from checkpoint, the shape in current model is torch.Size([1280, 1920, 1, 1]).
size mismatch for up_blocks.1.resnets.2.conv_shortcut.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.1.upsamplers.0.conv.weight: copying a param with shape torch.Size([640, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 3, 3]).
size mismatch for up_blocks.1.upsamplers.0.conv.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.2.resnets.0.norm1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]).
size mismatch for up_blocks.2.resnets.0.norm1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([1920]).
size mismatch for up_blocks.2.resnets.0.conv1.weight: copying a param with shape torch.Size([320, 960, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 1920, 3, 3]).
size mismatch for up_blocks.2.resnets.0.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.0.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]).
size mismatch for up_blocks.2.resnets.0.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.0.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.0.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.0.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]).
size mismatch for up_blocks.2.resnets.0.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.0.conv_shortcut.weight: copying a param with shape torch.Size([320, 960, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 1920, 1, 1]).
size mismatch for up_blocks.2.resnets.0.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.2.resnets.1.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for up_blocks.2.resnets.1.conv1.weight: copying a param with shape torch.Size([320, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 1280, 3, 3]).
size mismatch for up_blocks.2.resnets.1.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]).
size mismatch for up_blocks.2.resnets.1.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]).
size mismatch for up_blocks.2.resnets.1.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.1.conv_shortcut.weight: copying a param with shape torch.Size([320, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 1280, 1, 1]).
size mismatch for up_blocks.2.resnets.1.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.norm1.weight: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([960]).
size mismatch for up_blocks.2.resnets.2.norm1.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([960]).
size mismatch for up_blocks.2.resnets.2.conv1.weight: copying a param with shape torch.Size([320, 640, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 960, 3, 3]).
size mismatch for up_blocks.2.resnets.2.conv1.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.time_emb_proj.weight: copying a param with shape torch.Size([320, 1280]) from checkpoint, the shape in current model is torch.Size([640, 1280]).
size mismatch for up_blocks.2.resnets.2.time_emb_proj.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.norm2.weight: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.norm2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.conv2.weight: copying a param with shape torch.Size([320, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([640, 640, 3, 3]).
size mismatch for up_blocks.2.resnets.2.conv2.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for up_blocks.2.resnets.2.conv_shortcut.weight: copying a param with shape torch.Size([320, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 960, 1, 1]).
size mismatch for up_blocks.2.resnets.2.conv_shortcut.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([640]).
size mismatch for mid_block.attentions.0.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 2048]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for mid_block.attentions.0.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 996, in <module>
    main()
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 992, in main
    launch_command(args)
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 986, in launch_command
    simple_launcher(args)
  File "/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/venv/bin/python', '/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=/home/alex/.local/share/krita/pykrita/ai_diffusion/.server/ComfyUI/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors', '--train_data_dir=/home/alex/database', '--output_dir=models/loras', '--logging_dir=./logs', '--log_prefix=myimages', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=40', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=myimages', '--train_batch_size=1', '--save_every_n_epochs=10', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=2', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 1.
Train finished
Prompt executed in 9.06 seconds
```
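
A note on the log above: every failing `attn2.to_k`/`to_v` entry has 2048 input features in the checkpoint but 768 in the freshly built model. Those are the text-encoder context widths of SDXL and SD1.x respectively, which suggests the training script is constructing an SD1.x U-Net and then loading SDXL weights into it, so no choice of image resolution will fix this. If you want to verify which family a checkpoint belongs to before training, here is a minimal sketch assuming only that the `safetensors` package is installed; the file path is an example, not something the node requires:

```python
# Minimal sketch: read one cross-attention key from a .safetensors checkpoint
# and report its context width (768 = SD1.x, 1024 = SD2.x, 2048 = SDXL).
from safetensors import safe_open

path = "juggernautXL_version6Rundiffusion.safetensors"  # example path, adjust
with safe_open(path, framework="pt", device="cpu") as f:
    for key in f.keys():
        if key.endswith("attn2.to_k.weight"):
            ctx = f.get_tensor(key).shape[1]  # input features = context width
            family = {768: "SD1.x", 1024: "SD2.x", 2048: "SDXL"}.get(ctx, "unknown")
            print(f"{key}: context width {ctx} -> {family}")
            break
```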
RikkB commented 6 months ago

Just came across this repo, so I haven't tried it yet. However, SDXL is typically trained on 1024x1024 images. I've seen references to 768x768, but almost everything points to 1024x1024. I know that when I made my first SDXL checkpoint, it required 1024x1024 images.

xenogenesi commented 6 months ago

@RikkB thanks. I'm not an expert, but I remember the same: 1024 for SDXL. That said, the tutorials I've seen for both kohya (which I've never used) and this node say pre-resizing the images isn't essential, just useful for performance, since they should be resized automatically during the process. Just in case, I resized all the images to 1024x1024 anyway, and got the same error. Is anyone having success with SDXL?
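
For anyone who does want to pre-resize, note that a naive square resize distorts non-square images; the log above shows `--enable_bucket`, which instead groups images by aspect ratio, so pre-resizing is mainly a speed convenience. A minimal Pillow sketch under those caveats (folder names are placeholders, not paths the node expects):

```python
# Minimal sketch: copy a flat folder of images to 1024x1024 PNGs with Pillow.
# Beware: a fixed square resize stretches non-square images.
from pathlib import Path
from PIL import Image

src, dst = Path("dataset_raw"), Path("dataset_1024")  # placeholder folders
dst.mkdir(exist_ok=True)
for p in src.iterdir():
    if p.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp"}:
        img = Image.open(p).convert("RGB")
        img = img.resize((1024, 1024), Image.LANCZOS)  # high-quality resample
        img.save(dst / f"{p.stem}.png")
```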

urstrulybala commented 6 months ago

Hi, has anyone got this working on SDXL? Kindly confirm.

arlechinu commented 5 months ago

Will try some differently sized source images to see if that helps… Has anyone else tried anything to get this working for SDXL?
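
For what it's worth, the traceback above launches `train_network.py`, while the upstream kohya-ss/sd-scripts repo ships a separate `sdxl_train_network.py` for SDXL LoRA training; as long as the SD1.x script is invoked, SDXL checkpoints will hit these mismatches regardless of image size. A rough sketch of what an equivalent launch against the SDXL script might look like; paths and hyperparameters are carried over from the failing command or invented for illustration, and this is not something the node generates today:

```python
# Rough sketch: relaunch the same training against kohya's SDXL script.
# Assumes a stock kohya-ss/sd-scripts checkout and `accelerate` on PATH;
# all paths below are examples, adjust to your setup.
import subprocess

cmd = [
    "accelerate", "launch", "sd-scripts/sdxl_train_network.py",
    "--pretrained_model_name_or_path=/path/to/juggernautXL_version6Rundiffusion.safetensors",
    "--train_data_dir=/home/alex/database",
    "--output_dir=models/loras",
    "--resolution=1024,1024",            # SDXL's native training resolution
    "--enable_bucket",
    "--network_module=networks.lora",
    "--network_dim=32", "--network_alpha=32",
    "--max_train_epochs=40", "--learning_rate=1e-4",
    "--mixed_precision=fp16", "--save_precision=fp16",
    "--save_model_as=safetensors", "--output_name=myimages",
    # note: no --clip_skip here; that option targets SD1.x-style text encoders
]
subprocess.run(cmd, check=True)
```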