lllyasviel / ControlNet

Let us control diffusion models!
Apache License 2.0

Continuing training of a ControlNet #590

Open AmitMY opened 7 months ago

AmitMY commented 7 months ago

Following the tutorial, I can successfully download SD, add ControlNet, and train it.

Now, I want to continue training the OpenPose model with a different pose-estimation tool. I downloaded the .bin file from https://huggingface.co/lllyasviel/sd-controlnet-openpose/tree/main and did not run the add-ControlNet step (since the ControlNet should already be in the checkpoint), but when I try the following:

# loading code from the tutorial (tutorial_train.py), pointed at the downloaded .bin
from cldm.model import create_model, load_state_dict

model = create_model('./models/cldm_v15.yaml').cpu()
model.load_state_dict(load_state_dict('diffusion_pytorch_model.bin', location='cpu'))

I get this error (truncated because it is longer than 65K characters, full here):

        Missing key(s) in state_dict: "betas", "alphas_cumprod", "alphas_cumprod_prev", "sqrt_alphas_cumprod", "sqrt_one_minus_alphas_cumprod", "log_one_minus_alphas_cumprod", "sqrt_recip_alphas_cumprod", "sqrt_recipm1_alphas_cumprod", "posterior_variance", "posterior_log_variance_clipped", "posterior_mean_coef1", "posterior_mean_coef2", "logvar", "model.diffusion_model.time_embed.0.weight", "model.diffusion_model.time_embed.0.bias", "model.diffusion_model.time_embed.2.weight", "model.diffusion_model.time_embed.2.bias", 
        Unexpected key(s) in state_dict: "conv_in.weight", "conv_in.bias", "time_embedding.linear_1.weight", "time_embedding.linear_1.bias", "time_embedding.linear_2.weight", "time_embedding.linear_2.bias", "controlnet_cond_embedding.conv_in.weight", "controlnet_cond_embedding.conv_in.bias", "controlnet_cond_embedding.blocks.0.weight", "controlnet_cond_embedding.blocks.0.bias", "controlnet_cond_embedding.blocks.1.weight", "controlnet_cond_embedding.blocks.1.bias", "controlnet_cond_embedding.blocks.2.weight", "controlnet_cond_embedding.blocks.2.bias", "controlnet_cond_embedding.blocks.3.weight", "controlnet_cond_embedding.blocks.3.bias", "controlnet_cond_embedding.blocks.4.weight", "controlnet_cond_embedding.blocks.4.bias", "controlnet_cond_embedding.blocks.5.weight", "controlnet_cond_embedding.blocks.5.bias", "controlnet_cond_embedding.conv_out.weight", "controlnet_cond_embedding.conv_out.bias", "down_blocks.0.attentions.0.norm.weight", "down_blocks.0.attentions.0.norm.bias", "down_blocks.0.attentions.0.proj_in.weight", "down_blocks.0.attentions.0.proj_in.bias", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight", "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight", 

How should I correctly load the OpenPose model for continuing training?
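
A note on the error above: the missing keys (betas, model.diffusion_model.*, ...) are the names that create_model('./models/cldm_v15.yaml') expects, while the unexpected keys (conv_in.*, controlnet_cond_embedding.*, down_blocks.*) are diffusers-style names, so the downloaded diffusion_pytorch_model.bin appears to be a diffusers-format checkpoint rather than one in this repo's cldm format. A minimal sketch of loading the same weights with the diffusers library instead (assuming diffusers is installed; using them with the cldm scripts would still require remapping the keys or starting from a checkpoint already in this repo's format):

# Minimal sketch: load the same weights with diffusers, which uses this key layout.
from diffusers import ControlNetModel

# Hub id shown for illustration; a local folder containing the .bin plus its
# config.json works the same way.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose")
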

Xelawk commented 6 months ago

> How should I correctly load the OpenPose model for continuing training?

Have you fixed this?

AmitMY commented 6 months ago

I moved to using Hugging Face diffusers, following this: https://github.com/sign-language-processing/pose-to-video/blob/main/pose_to_video/conditional/controlnet/train.sh#L74-L94
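
For anyone landing here: the linked script drives a diffusers ControlNet training run. Below is a rough, self-contained sketch of that route (not the script itself) for continuing from the openpose weights with diffusers components. The model ids, the training_step function, and its arguments are illustrative; data loading, accelerate, LR schedulers, EMA, and validation are omitted, and epsilon prediction (the SD 1.5 default) is assumed.

# Rough sketch of continuing ControlNet training with diffusers components.
import torch
from diffusers import AutoencoderKL, ControlNetModel, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

base = "runwayml/stable-diffusion-v1-5"  # placeholder SD 1.5 base id; kept frozen
tokenizer = CLIPTokenizer.from_pretrained(base, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(base, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet")
noise_scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")

# Start from the existing openpose ControlNet instead of a fresh copy of the UNet encoder.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose")

for m in (vae, text_encoder, unet):
    m.requires_grad_(False)
controlnet.train()
optimizer = torch.optim.AdamW(controlnet.parameters(), lr=1e-5)

def training_step(pixel_values, conditioning_image, captions):
    """One denoising-loss step (hypothetical helper): images in [-1, 1],
    pose renderings in [0, 1], captions as a list of strings."""
    latents = vae.encode(pixel_values).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device).long()
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

    tokens = tokenizer(captions, padding="max_length", truncation=True,
                       max_length=tokenizer.model_max_length, return_tensors="pt")
    encoder_hidden_states = text_encoder(tokens.input_ids)[0]

    # ControlNet produces residuals from the new pose-estimation renderings...
    down_res, mid_res = controlnet(
        noisy_latents, timesteps,
        encoder_hidden_states=encoder_hidden_states,
        controlnet_cond=conditioning_image,
        return_dict=False,
    )
    # ...which are injected into the frozen UNet.
    pred = unet(
        noisy_latents, timesteps,
        encoder_hidden_states=encoder_hidden_states,
        down_block_additional_residuals=down_res,
        mid_block_additional_residual=mid_res,
    ).sample

    loss = torch.nn.functional.mse_loss(pred, noise)  # epsilon-prediction target
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

In practice the diffusers example script handles all of the omitted pieces, so the sketch is only meant to show where the pretrained openpose weights and the new conditioning images enter the loop.
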