kijai / ComfyUI-MimicMotionWrapper

Apache License 2.0
310 stars 25 forks

Error occurred when executing DownloadAndLoadMimicMotionModel. I ran the same workflow without changing any parameters and got the error below. #65

Open chiragjain777 opened 2 months ago

chiragjain777 commented 2 months ago

Error occurred when executing DownloadAndLoadMimicMotionModel:

Cannot load from /home/chirag/ComfyUI/models/diffusers/stable-video-diffusion-img2vid-xt-1-1 because the following keys are missing: decoder.up_blocks.0.resnets.2.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.1.spatial_res_block.conv1.bias, decoder.mid_block.resnets.0.spatial_res_block.conv1.bias, decoder.up_blocks.1.resnets.2.spatial_res_block.conv1.weight, decoder.mid_block.resnets.1.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.conv2.bias, decoder.up_blocks.0.resnets.0.time_mixer.mix_factor, decoder.up_blocks.2.resnets.1.spatial_res_block.conv2.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.norm1.weight, decoder.up_blocks.0.resnets.2.spatial_res_block.norm2.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.conv1.bias, decoder.mid_block.resnets.1.temporal_res_block.conv1.weight, decoder.up_blocks.1.resnets.2.temporal_res_block.norm2.bias, decoder.up_blocks.2.resnets.2.spatial_res_block.conv2.bias, decoder.mid_block.resnets.1.spatial_res_block.conv2.bias, decoder.up_blocks.2.resnets.2.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.norm2.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.norm2.weight, decoder.up_blocks.2.resnets.0.temporal_res_block.conv2.bias, decoder.up_blocks.3.resnets.1.spatial_res_block.conv2.bias, decoder.mid_block.resnets.1.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.0.spatial_res_block.norm2.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.1.spatial_res_block.conv1.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.conv1.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.norm2.bias, decoder.mid_block.resnets.1.temporal_res_block.norm2.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.conv1.weight, decoder.up_blocks.1.resnets.1.spatial_res_block.norm2.weight, decoder.mid_block.resnets.0.spatial_res_block.norm2.weight, 
decoder.up_blocks.1.resnets.2.spatial_res_block.norm2.weight, decoder.up_blocks.0.resnets.2.temporal_res_block.norm2.bias, decoder.up_blocks.3.resnets.1.spatial_res_block.norm1.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.conv_shortcut.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.2.spatial_res_block.conv2.bias, decoder.up_blocks.1.resnets.0.temporal_res_block.norm1.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.norm1.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.conv1.bias, decoder.mid_block.resnets.0.spatial_res_block.conv1.weight, decoder.up_blocks.0.resnets.1.spatial_res_block.conv2.weight, decoder.up_blocks.3.resnets.2.temporal_res_block.norm1.bias, decoder.mid_block.resnets.0.temporal_res_block.conv1.bias, decoder.up_blocks.1.resnets.0.spatial_res_block.norm2.bias, decoder.up_blocks.3.resnets.1.spatial_res_block.norm1.weight, decoder.up_blocks.3.resnets.1.temporal_res_block.norm2.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.norm1.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.norm2.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.norm2.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.norm1.bias, decoder.up_blocks.3.resnets.1.temporal_res_block.norm2.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.conv_shortcut.weight, decoder.up_blocks.2.resnets.2.time_mixer.mix_factor, decoder.up_blocks.0.resnets.1.spatial_res_block.norm1.bias, decoder.up_blocks.0.resnets.2.spatial_res_block.conv1.weight, decoder.time_conv_out.bias, decoder.up_blocks.0.resnets.0.spatial_res_block.conv2.bias, decoder.up_blocks.1.resnets.1.temporal_res_block.conv1.bias, decoder.up_blocks.3.resnets.0.spatial_res_block.norm1.bias, decoder.mid_block.resnets.0.temporal_res_block.norm1.bias, decoder.up_blocks.3.resnets.2.spatial_res_block.conv2.weight, decoder.up_blocks.3.resnets.2.temporal_res_block.conv1.bias, decoder.mid_block.resnets.1.spatial_res_block.norm2.weight, 
decoder.up_blocks.0.resnets.1.spatial_res_block.norm1.weight, decoder.mid_block.resnets.0.time_mixer.mix_factor, decoder.mid_block.resnets.1.spatial_res_block.norm1.weight, decoder.up_blocks.3.resnets.2.spatial_res_block.norm1.weight, decoder.up_blocks.0.resnets.2.spatial_res_block.conv2.bias, decoder.up_blocks.3.resnets.2.temporal_res_block.norm2.weight, decoder.up_blocks.0.resnets.0.spatial_res_block.norm1.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.conv1.weight, decoder.up_blocks.1.resnets.2.spatial_res_block.conv1.bias, decoder.up_blocks.0.resnets.1.time_mixer.mix_factor, decoder.up_blocks.2.resnets.1.spatial_res_block.conv2.weight, decoder.up_blocks.3.resnets.2.spatial_res_block.norm1.bias, decoder.mid_block.resnets.0.spatial_res_block.conv2.weight, decoder.up_blocks.0.resnets.2.temporal_res_block.conv2.bias, decoder.up_blocks.0.resnets.0.spatial_res_block.conv1.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.norm1.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.conv_shortcut.bias, decoder.up_blocks.0.resnets.0.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.0.time_mixer.mix_factor, decoder.up_blocks.2.resnets.2.temporal_res_block.norm2.bias, decoder.up_blocks.3.resnets.2.temporal_res_block.conv2.weight, decoder.up_blocks.3.resnets.1.spatial_res_block.norm2.weight, decoder.up_blocks.0.resnets.0.temporal_res_block.norm1.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.0.temporal_res_block.norm2.weight, decoder.up_blocks.3.resnets.2.spatial_res_block.conv1.weight, decoder.up_blocks.2.resnets.0.spatial_res_block.norm2.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.conv2.weight, decoder.up_blocks.1.resnets.1.spatial_res_block.norm1.bias, decoder.up_blocks.0.resnets.2.spatial_res_block.norm1.bias, decoder.up_blocks.3.resnets.1.temporal_res_block.conv2.bias, decoder.up_blocks.3.resnets.0.temporal_res_block.conv2.weight, 
decoder.up_blocks.0.resnets.0.spatial_res_block.norm1.bias, decoder.up_blocks.0.resnets.0.temporal_res_block.conv2.bias, decoder.up_blocks.3.resnets.2.temporal_res_block.norm2.bias, decoder.up_blocks.0.resnets.1.temporal_res_block.norm2.bias, decoder.up_blocks.1.resnets.0.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.norm2.weight, decoder.mid_block.resnets.0.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.0.spatial_res_block.norm1.weight, decoder.up_blocks.3.resnets.2.temporal_res_block.conv1.weight, decoder.up_blocks.3.resnets.2.spatial_res_block.norm2.weight, decoder.up_blocks.0.resnets.1.spatial_res_block.conv2.bias, decoder.up_blocks.0.resnets.1.spatial_res_block.conv1.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.conv1.weight, decoder.up_blocks.2.resnets.1.spatial_res_block.norm1.weight, decoder.up_blocks.3.resnets.0.time_mixer.mix_factor, decoder.mid_block.resnets.0.spatial_res_block.conv2.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.conv2.weight, decoder.up_blocks.2.resnets.0.spatial_res_block.conv1.bias, decoder.up_blocks.3.resnets.1.temporal_res_block.conv2.weight, decoder.up_blocks.2.resnets.2.temporal_res_block.norm1.bias, decoder.mid_block.resnets.0.temporal_res_block.conv1.weight, decoder.up_blocks.2.resnets.1.spatial_res_block.norm1.bias, decoder.up_blocks.1.resnets.0.spatial_res_block.norm1.bias, decoder.up_blocks.2.resnets.1.temporal_res_block.norm2.bias, decoder.up_blocks.3.resnets.2.spatial_res_block.conv1.bias, decoder.up_blocks.2.resnets.0.spatial_res_block.conv2.bias, decoder.up_blocks.0.resnets.0.spatial_res_block.norm2.weight, decoder.up_blocks.3.resnets.2.temporal_res_block.norm1.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.conv1.weight, decoder.mid_block.resnets.1.spatial_res_block.norm2.bias, decoder.up_blocks.2.resnets.1.spatial_res_block.norm2.weight, decoder.up_blocks.0.resnets.2.time_mixer.mix_factor, 
decoder.up_blocks.3.resnets.0.temporal_res_block.conv1.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.conv1.weight, decoder.up_blocks.2.resnets.2.temporal_res_block.norm2.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.norm2.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.conv1.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.norm2.bias, decoder.up_blocks.1.resnets.1.spatial_res_block.norm1.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.conv1.weight, decoder.mid_block.resnets.0.temporal_res_block.norm2.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.norm2.weight, decoder.up_blocks.3.resnets.0.temporal_res_block.norm2.bias, decoder.mid_block.resnets.1.spatial_res_block.conv1.bias, decoder.up_blocks.0.resnets.1.temporal_res_block.conv2.bias, decoder.up_blocks.0.resnets.0.temporal_res_block.norm2.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.conv1.bias, decoder.up_blocks.1.resnets.1.temporal_res_block.norm2.bias, decoder.up_blocks.3.resnets.0.temporal_res_block.norm1.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.conv2.weight, decoder.up_blocks.1.resnets.1.spatial_res_block.norm2.bias, decoder.up_blocks.1.resnets.2.spatial_res_block.norm2.bias, decoder.up_blocks.1.resnets.0.spatial_res_block.conv1.weight, decoder.up_blocks.3.resnets.1.spatial_res_block.conv1.bias, decoder.up_blocks.3.resnets.0.spatial_res_block.conv2.bias, decoder.up_blocks.3.resnets.2.spatial_res_block.conv2.bias, decoder.up_blocks.0.resnets.2.spatial_res_block.conv1.bias, decoder.up_blocks.2.resnets.1.spatial_res_block.norm2.bias, decoder.up_blocks.1.resnets.0.spatial_res_block.conv1.bias, decoder.up_blocks.1.resnets.2.spatial_res_block.norm1.bias, decoder.up_blocks.0.resnets.0.temporal_res_block.norm2.bias, decoder.up_blocks.2.resnets.2.spatial_res_block.norm1.weight, decoder.up_blocks.3.resnets.1.spatial_res_block.conv1.weight, decoder.up_blocks.3.resnets.0.temporal_res_block.conv1.bias, 
decoder.up_blocks.3.resnets.2.spatial_res_block.norm2.bias, decoder.up_blocks.3.resnets.0.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.0.time_mixer.mix_factor, decoder.mid_block.resnets.0.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.1.temporal_res_block.conv1.weight, decoder.up_blocks.2.resnets.1.time_mixer.mix_factor, decoder.up_blocks.3.resnets.1.time_mixer.mix_factor, decoder.mid_block.resnets.1.time_mixer.mix_factor, decoder.up_blocks.1.resnets.2.temporal_res_block.conv1.bias, decoder.up_blocks.0.resnets.2.temporal_res_block.norm1.weight, decoder.up_blocks.3.resnets.1.spatial_res_block.conv2.weight, decoder.up_blocks.3.resnets.1.temporal_res_block.conv1.bias, decoder.up_blocks.1.resnets.1.spatial_res_block.conv1.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.conv2.weight, decoder.up_blocks.0.resnets.1.spatial_res_block.conv1.weight, decoder.mid_block.resnets.1.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.1.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.2.spatial_res_block.conv2.weight, decoder.up_blocks.2.resnets.0.spatial_res_block.norm1.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.conv1.weight, decoder.up_blocks.3.resnets.1.temporal_res_block.norm1.weight, decoder.up_blocks.0.resnets.1.spatial_res_block.norm2.weight, decoder.up_blocks.0.resnets.0.spatial_res_block.norm2.bias, decoder.up_blocks.3.resnets.0.temporal_res_block.norm2.weight, decoder.mid_block.resnets.1.temporal_res_block.norm2.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.0.spatial_res_block.norm2.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.conv1.bias, decoder.up_blocks.0.resnets.2.temporal_res_block.conv1.weight, decoder.up_blocks.2.resnets.2.temporal_res_block.conv1.bias, decoder.mid_block.resnets.0.spatial_res_block.norm1.weight, decoder.up_blocks.1.resnets.1.temporal_res_block.norm2.weight, decoder.up_blocks.1.resnets.1.temporal_res_block.norm1.bias, 
decoder.up_blocks.1.resnets.1.spatial_res_block.conv2.weight, decoder.up_blocks.1.resnets.2.time_mixer.mix_factor, decoder.mid_block.resnets.1.temporal_res_block.norm1.bias, decoder.mid_block.resnets.0.temporal_res_block.norm1.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.norm1.weight, decoder.up_blocks.0.resnets.2.temporal_res_block.conv1.bias, decoder.up_blocks.0.resnets.2.temporal_res_block.conv2.weight, decoder.mid_block.resnets.0.spatial_res_block.norm1.bias, decoder.up_blocks.2.resnets.0.temporal_res_block.conv1.bias, decoder.up_blocks.0.resnets.2.spatial_res_block.norm2.bias, decoder.mid_block.resnets.1.spatial_res_block.conv1.weight, decoder.up_blocks.3.resnets.0.spatial_res_block.norm1.weight, decoder.time_conv_out.weight, decoder.up_blocks.1.resnets.0.spatial_res_block.conv2.bias, decoder.mid_block.resnets.1.temporal_res_block.norm1.weight, decoder.up_blocks.1.resnets.2.spatial_res_block.norm1.weight, decoder.up_blocks.0.resnets.2.spatial_res_block.norm1.weight, decoder.up_blocks.1.resnets.0.temporal_res_block.norm1.weight, decoder.up_blocks.3.resnets.0.temporal_res_block.norm1.bias, decoder.up_blocks.0.resnets.0.temporal_res_block.conv2.weight, decoder.up_blocks.2.resnets.2.spatial_res_block.norm1.bias, decoder.up_blocks.3.resnets.2.time_mixer.mix_factor, decoder.up_blocks.3.resnets.1.temporal_res_block.conv1.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.conv2.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.conv2.bias, decoder.up_blocks.1.resnets.2.temporal_res_block.norm1.bias, decoder.up_blocks.0.resnets.0.temporal_res_block.norm1.bias, decoder.mid_block.resnets.1.spatial_res_block.norm1.bias, decoder.up_blocks.0.resnets.1.spatial_res_block.norm2.bias, decoder.mid_block.resnets.0.spatial_res_block.norm2.bias, decoder.mid_block.resnets.0.temporal_res_block.norm2.weight, decoder.mid_block.resnets.1.temporal_res_block.conv1.bias, decoder.up_blocks.0.resnets.0.spatial_res_block.conv1.weight, 
decoder.up_blocks.0.resnets.0.temporal_res_block.conv1.bias, decoder.up_blocks.3.resnets.0.spatial_res_block.conv_shortcut.weight, decoder.up_blocks.2.resnets.2.temporal_res_block.norm1.weight, decoder.up_blocks.1.resnets.1.temporal_res_block.norm1.weight, decoder.up_blocks.2.resnets.1.temporal_res_block.norm1.bias, decoder.up_blocks.3.resnets.1.spatial_res_block.norm2.bias, decoder.up_blocks.0.resnets.2.temporal_res_block.norm2.weight, decoder.up_blocks.0.resnets.1.temporal_res_block.norm1.bias, decoder.up_blocks.0.resnets.2.temporal_res_block.norm1.bias, decoder.up_blocks.3.resnets.1.temporal_res_block.norm1.bias, decoder.up_blocks.3.resnets.0.spatial_res_block.norm2.bias, decoder.up_blocks.1.resnets.1.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.1.time_mixer.mix_factor, decoder.up_blocks.1.resnets.1.spatial_res_block.conv1.bias, decoder.up_blocks.2.resnets.2.temporal_res_block.conv1.weight, decoder.up_blocks.0.resnets.0.temporal_res_block.conv1.weight, decoder.up_blocks.3.resnets.2.temporal_res_block.conv2.bias, decoder.up_blocks.2.resnets.2.temporal_res_block.conv2.weight, decoder.up_blocks.1.resnets.1.spatial_res_block.conv2.bias. Please make sure to pass low_cpu_mem_usage=False and device_map=None if you want to randomly initialize those weights or else make sure your checkpoint file is correct.

  File "/home/chirag/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/home/chirag/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/home/chirag/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/home/chirag/ComfyUI/custom_nodes/ComfyUI-MimicMotionWrapper/nodes.py", line 131, in loadmodel
    self.vae = AutoencoderKLTemporalDecoder.from_pretrained(svd_path, subfolder="vae", variant="fp16", low_cpu_mem_usage=True).to(dtype).to(device).eval()
  File "/home/chirag/anaconda3/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/chirag/anaconda3/lib/python3.9/site-packages/diffusers/models/modeling_utils.py", line 750, in from_pretrained
    raise ValueError(

Screenshot from 2024-08-06 22-42-51
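The error message asks to "make sure your checkpoint file is correct", and every key it lists starts with `decoder.`. One way to check whether those tensors are actually in the VAE file on disk is to read only the file's header. The sketch below is not part of the wrapper or of diffusers. The function names are made up for illustration, and it assumes the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header). The file name in the usage note is also an assumption, based on how diffusers usually names `variant="fp16"` weights.

```python
import json
import os
import struct
import sys


def read_safetensors_keys(path):
    """Return the sorted tensor names listed in a .safetensors header.

    Only the header is read, so no tensor data is loaded into memory.
    Raises ValueError if the file is too short to hold the header it declares.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        raw = f.read(8)
        if len(raw) < 8:
            raise ValueError("file is too short to contain a safetensors header")
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", raw)
        if 8 + header_len > size:
            raise ValueError("declared header is longer than the file (truncated download?)")
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return sorted(k for k in header if k != "__metadata__")


def count_by_prefix(keys):
    """Count tensor names by their first dotted component, e.g. 'decoder'."""
    counts = {}
    for key in keys:
        prefix = key.split(".", 1)[0]
        counts[prefix] = counts.get(prefix, 0) + 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_keys.py <path to the .safetensors file>
    found = read_safetensors_keys(sys.argv[1])
    print(count_by_prefix(found))
```

For this issue the file to pass would presumably be `vae/diffusion_pytorch_model.fp16.safetensors` under `/home/chirag/ComfyUI/models/diffusers/stable-video-diffusion-img2vid-xt-1-1` (file name assumed, not confirmed by the traceback). If the printed counts have no `decoder` entry, or the script raises `ValueError`, the file is probably incomplete and downloading it again may help.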

HiddenPeak commented 2 months ago

Same

Zynster commented 1 week ago

Yes, I'm also getting this