tumurzakov / AnimateDiff

AnimationDiff with train
Apache License 2.0
111 stars 28 forks

Mismatched diffusers version & wrong VAE checkpoint loading on 22.09.2023 #12

Open yitongx opened 11 months ago

yitongx commented 11 months ago

Hi, first of all, thanks for sharing your improved code!

While testing your Jupyter notebook on Colab, I ran into the following problems:

First, are you certain that diffusers==0.11.0 can actually run this code? The import `from diffusers.models.attention_processor import AttentionProcessor, AttnProcessor` requires at least version 0.14.0. I'm currently using diffusers==0.21.0 instead, and only needed to modify one import (`from diffusers.utils.torch_utils import is_compiled_module`). After installing a few missing packages (wandb, compel, ffmpeg), I was able to finish the training part successfully.
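For anyone hitting the same import error, a small check like the one below makes the version requirement explicit before running the notebook. This is a hypothetical helper of my own, not part of this repo, and it assumes the 0.14.0 minimum stated above:

```python
# Hypothetical helper (not part of this repo): check the installed diffusers
# version before running the notebook, since the AttentionProcessor import
# appears to require diffusers >= 0.14.0.
from importlib.metadata import PackageNotFoundError, version


def parse_version(v: str) -> tuple:
    """Turn a version string like '0.21.0' into (0, 21, 0) for comparison."""
    return tuple(int(part) for part in v.split(".")[:3])


def diffusers_at_least(minimum: str = "0.14.0") -> bool:
    """Return True if the installed diffusers meets the minimum version."""
    try:
        return parse_version(version("diffusers")) >= parse_version(minimum)
    except PackageNotFoundError:
        return False  # diffusers is not installed at all
```

With diffusers==0.21.0 installed, `diffusers_at_least("0.14.0")` returns True, while an 0.11.0 install would fail the check.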

yitongx commented 11 months ago

Second, under diffusers==0.21.0, I failed at the final inference step with an unexpected VAE checkpoint loading error:

```
RuntimeError                              Traceback (most recent call last)
in <cell line: 63>()
    111 # vae
    112 converted_vae_checkpoint = convert_ldm_vae_checkpoint(base_state_dict, pipeline.vae.config)
--> 113 pipeline.vae.load_state_dict(converted_vae_checkpoint)
    114 # unet
    115 converted_unet_checkpoint = convert_ldm_unet_checkpoint(base_state_dict, pipeline.unet.config)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   2039
   2040         if len(error_msgs) > 0:
-> 2041             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                 self.__class__.__name__, "\n\t".join(error_msgs)))
   2043         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for AutoencoderKL:
	Missing key(s) in state_dict: "encoder.mid_block.attentions.0.to_q.weight", "encoder.mid_block.attentions.0.to_q.bias", "encoder.mid_block.attentions.0.to_k.weight", "encoder.mid_block.attentions.0.to_k.bias", "encoder.mid_block.attentions.0.to_v.weight", "encoder.mid_block.attentions.0.to_v.bias", "encoder.mid_block.attentions.0.to_out.0.weight", "encoder.mid_block.attentions.0.to_out.0.bias", "decoder.mid_block.attentions.0.to_q.weight", "decoder.mid_block.attentions.0.to_q.bias", "decoder.mid_block.attentions.0.to_k.weight", "decoder.mid_block.attentions.0.to_k.bias", "decoder.mid_block.attentions.0.to_v.weight", "decoder.mid_block.attentions.0.to_v.bias", "decoder.mid_block.attentions.0.to_out.0.weight", "decoder.mid_block.attentions.0.to_out.0.bias".
	Unexpected key(s) in state_dict: "encoder.mid_block.attentions.0.key.bias", "encoder.mid_block.attentions.0.key.weight", "encoder.mid_block.attentions.0.proj_attn.bias", "encoder.mid_block.attentions.0.proj_attn.weight", "encoder.mid_block.attentions.0.query.bias", "encoder.mid_block.attentions.0.query.weight", "encoder.mid_block.attentions.0.value.bias", "encoder.mid_block.attentions.0.value.weight", "decoder.mid_block.attentions.0.key.bias", "decoder.mid_block.attentions.0.key.weight", "decoder.mid_block.attentions.0.proj_attn.bias", "decoder.mid_block.attentions.0.proj_attn.weight", "decoder.mid_block.attentions.0.query.bias", "decoder.mid_block.attentions.0.query.weight", "decoder.mid_block.attentions.0.value.bias", "decoder.mid_block.attentions.0.value.weight".
```
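For what it's worth, the missing/unexpected key pairs line up exactly with the attention-key rename in newer diffusers releases (query/key/value/proj_attn became to_q/to_k/to_v/to_out.0 in the VAE mid blocks). A minimal sketch of a workaround, assuming the converted checkpoint still uses the legacy names (`remap_vae_attention_keys` is my own hypothetical helper, not from either codebase):

```python
# Hedged sketch: remap legacy diffusers VAE attention keys to the newer
# names before calling pipeline.vae.load_state_dict(...). Pure dict
# manipulation, so no torch is needed to illustrate the renaming.
LEGACY_TO_NEW = {
    ".query.": ".to_q.",
    ".key.": ".to_k.",
    ".value.": ".to_v.",
    ".proj_attn.": ".to_out.0.",
}


def remap_vae_attention_keys(state_dict: dict) -> dict:
    """Return a copy of state_dict with legacy attention key names renamed."""
    remapped = {}
    for name, tensor in state_dict.items():
        new_name = name
        if ".attentions." in name:  # only attention blocks were renamed
            for old, new in LEGACY_TO_NEW.items():
                new_name = new_name.replace(old, new)
        remapped[new_name] = tensor
    return remapped
```

One would then call `pipeline.vae.load_state_dict(remap_vae_attention_keys(converted_vae_checkpoint))`; non-attention keys pass through unchanged.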

This also happened when I tried to reproduce the official inference results using the yaml configs.

I'm not familiar with your code. Compared to the original AnimateDiff, did you modify the VAE model structure somewhere? Or maybe I overlooked something.

Thanks in advance and best regards.