G-U-N / AnimateLCM

AnimateLCM: Let's Accelerate the Video Generation within 4 Steps!
https://animatelcm.github.io
MIT License

No position encoder in your huggingface sd15 model weights #31

Open continue-revolution opened 1 month ago

continue-revolution commented 1 month ago

Someone asked me to support your work in AUTOMATIC1111 SD WebUI, and I have some questions.

There is no position encoder in https://huggingface.co/wangfuyun/AnimateLCM/resolve/main/AnimateLCM_sd15_t2v.ckpt?download=true

However, in your diffusers weights https://huggingface.co/wangfuyun/AnimateLCM/resolve/main/diffusion_pytorch_model.fp16.safetensors?download=true, I do find position encoders, but they are incomplete. Typically the mid_block has two position encoders, but in your diffusers model there is only one.

It seems to me that you did require position encoding in your model because I saw https://github.com/G-U-N/AnimateLCM/blob/master/animatelcm_sd15/configs/inference-t2v.yaml#L23

I would appreciate it if anyone could explain the difference between your model and the original AnimateDiff architecture, or fix this issue by updating your weights on Hugging Face.
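For reference, here is a minimal sketch of how one might check which position-encoder tensors a checkpoint actually contains. The `pos_encoder` substring follows AnimateDiff's usual motion-module key naming; other conversions may name these keys differently.

```python
def pos_encoder_keys(state_dict):
    """Return the names of position-encoder tensors in a motion-module
    state dict, assuming AnimateDiff-style 'pos_encoder' key naming."""
    return sorted(k for k in state_dict if "pos_encoder" in k)
```

Load the checkpoint with `torch.load("AnimateLCM_sd15_t2v.ckpt", map_location="cpu")` (unwrapping a nested `"state_dict"` entry if present) and pass the resulting dict in; an empty result matches what is reported above for the sd15 checkpoint.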

G-U-N commented 1 month ago

I did try different architectures, but the released version is essentially identical to AnimateDiff.

To your question: the position embeddings are just fixed hyperparameters rather than trained weights, so I deleted them from my code base and my weights. You won't get an error if you use my code.
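Since these embeddings are the standard non-trainable sinusoidal tables, a missing position-encoder buffer can simply be regenerated at load time instead of being read from the checkpoint. A sketch of that table, assuming the usual `(1, max_len, d_model)` layout; the exact `max_len` and buffer names used by AnimateLCM are assumptions here:

```python
import math
import torch

def sinusoidal_pe(d_model: int, max_len: int = 24) -> torch.Tensor:
    """Standard fixed sinusoidal position table ("Attention Is All You
    Need" style), shaped (1, max_len, d_model). Because it is computed
    from hyperparameters alone, it need not be stored in the weights."""
    position = torch.arange(max_len).unsqueeze(1)              # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
    )                                                          # (d_model / 2,)
    pe = torch.zeros(1, max_len, d_model)
    pe[0, :, 0::2] = torch.sin(position * div_term)
    pe[0, :, 1::2] = torch.cos(position * div_term)
    return pe
```

A loader that hits a missing `pos_encoder` entry could fill it with such a table (for the appropriate channel width and frame count) before calling `load_state_dict`, which is why the released code runs without those keys.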

The diffusers weights were converted by the diffusers team, so I am not entirely sure about their correctness.

You may also refer to https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved, which supports the loading of AnimateLCM related weights.