G-U-N / AnimateLCM

AnimateLCM: Let's Accelerate the Video Generation within 4 Steps!
https://animatelcm.github.io
MIT License

No position encoder in your huggingface sd15 model weights #31

Open continue-revolution opened 1 month ago

continue-revolution commented 1 month ago

Someone asked me to support your work in AUTOMATIC1111 SD WebUI, and I have some questions.

There is no position encoder in https://huggingface.co/wangfuyun/AnimateLCM/resolve/main/AnimateLCM_sd15_t2v.ckpt?download=true

However, in your diffusers weights https://huggingface.co/wangfuyun/AnimateLCM/resolve/main/diffusion_pytorch_model.fp16.safetensors?download=true, I do find position encoders, but they are incomplete. Typically the mid_block has two position encoders, but in your diffusers model there is only one.

It seems to me that you did require position encoding in your model because I saw https://github.com/G-U-N/AnimateLCM/blob/master/animatelcm_sd15/configs/inference-t2v.yaml#L23

I would appreciate it if anyone could explain the difference between your model and the original AnimateDiff architecture, or fix this issue by updating your weights on Hugging Face.
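For reference, here is a minimal sketch of how one might check which position-encoder tensors a checkpoint actually contains. The `pos_encoder` substring follows AnimateDiff's usual motion-module key naming; other conversions may name these keys differently.

```python
def pos_encoder_keys(state_dict):
    """Return the names of position-encoder tensors in a motion-module
    state dict, assuming AnimateDiff-style 'pos_encoder' key naming."""
    return sorted(k for k in state_dict if "pos_encoder" in k)
```

Load the checkpoint with `torch.load("AnimateLCM_sd15_t2v.ckpt", map_location="cpu")` (unwrapping a nested `"state_dict"` entry if present) and pass the resulting dict in; an empty result matches what is reported above for the sd15 checkpoint.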

G-U-N commented 1 month ago

I did try different architectures, but the released version is essentially identical to AnimateDiff.

To your question: the position embeddings are just fixed hyperparameters rather than trained weights, so I deleted them from my code base and my weights. You won't get an error if you use my code.
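Since these embeddings are the standard non-trainable sinusoidal tables, a missing position-encoder buffer can simply be regenerated at load time instead of being read from the checkpoint. A sketch of that table, assuming the usual `(1, max_len, d_model)` layout; the exact `max_len` and buffer names used by AnimateLCM are assumptions here:

```python
import math
import torch

def sinusoidal_pe(d_model: int, max_len: int = 24) -> torch.Tensor:
    """Standard fixed sinusoidal position table ("Attention Is All You
    Need" style), shaped (1, max_len, d_model). Because it is computed
    from hyperparameters alone, it need not be stored in the weights."""
    position = torch.arange(max_len).unsqueeze(1)              # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
    )                                                          # (d_model / 2,)
    pe = torch.zeros(1, max_len, d_model)
    pe[0, :, 0::2] = torch.sin(position * div_term)
    pe[0, :, 1::2] = torch.cos(position * div_term)
    return pe
```

A loader that hits a missing `pos_encoder` entry could fill it with such a table (for the appropriate channel width and frame count) before calling `load_state_dict`, which is why the released code runs without those keys.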

The diffusers weights were converted by the diffusers team, so I am not entirely sure about their correctness.

You may also refer to https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved, which supports the loading of AnimateLCM related weights.