TianxingWu / FreeInit

[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models
https://tianxingwu.github.io/pages/FreeInit/
MIT License

Work with SDXL? #8

Open sunhengzhe opened 9 months ago

sunhengzhe commented 9 months ago

Impressive examples! Does it support SDXL?

TianxingWu commented 9 months ago

I haven't tested the SDXL-beta version of AnimateDiff yet, but in theory there should be no big difference. I took a quick look and found that the code of that branch is mostly identical to the main branch, so I suppose you can directly copy part of the example code here (pipeline_animation.py, animate_with_freeinit.py) to that branch and see whether it works.

sunhengzhe commented 9 months ago

@TianxingWu Thanks for the reply! I tried it today and it doesn't seem that easy. I started by copying pipeline_animation.py, animate_with_freeinit.py and freeinit_utils.py as you suggested. Then, because of the parameter changes in the AnimateDiff SDXL branch, I modified this part:

# scripts/animate_with_freeinit.py
pipeline = load_weights(
    pipeline,
    motion_module_path         = motion_module,
    ckpt_path                  = model_config.get("dreambooth_path", ""),
    lora_path                  = model_config.get("lora_model_path", ""),
    lora_alpha                 = model_config.get("lora_alpha", 0.8),
).to("cuda")

But when I run the script, it raises the following error:

  0%|                                                                                    | 0/25 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/miniconda/envs/animatediff/lib/python3.10/site-packages/einops/einops.py", line 522, in reduce
    recipe = _prepare_transformation_recipe(pattern, reduction, axes_names=tuple(axes_lengths), ndim=len(shape))
  File "/workspace/miniconda/envs/animatediff/lib/python3.10/site-packages/einops/einops.py", line 365, in _prepare_transformation_recipe
    raise EinopsError(f"Wrong shape: expected {len(left.composition)} dims. Received {ndim}-dim tensor.")
einops.EinopsError: Wrong shape: expected 1 dims. Received 0-dim tensor.
gladzhang commented 8 months ago

Running the FreeInit code on the AnimateDiff SDXL branch, the pipeline itself runs without problems. However, that branch of AnimateDiff only works well with EulerDiscreteScheduler (once you switch to DDIM the results become very poor), while FreeInit with EulerDiscreteScheduler produces pure black images. Do you know what might be causing this, or could you suggest some possible fixes? Thanks 🙏

TianxingWu commented 8 months ago

@gladzhang From my experience, the problem probably lies in the implementation of using the diffusion forward process to get the noisy latents z_T. Specifically, the add_noise function of EulerDiscreteScheduler is implemented very differently from that of the DDIM scheduler, so you should be careful to ensure the resulting z_T is at the same noise level as in the training process. Though I'm not sure about your exact implementation, I would suggest normalizing the noisy_samples you get from EulerDiscreteScheduler.add_noise by:

noisy_samples /= torch.sqrt(1.0 + sigma ** 2.0)
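For reference, here is a minimal sketch of what I mean, applied outside the scheduler right after the add_noise call. The names scheduler, clean_latents, noise and t below are placeholders rather than names from this repo, and t is assumed to be a 1-element timestep tensor:

import torch

# EulerDiscreteScheduler.add_noise returns  sample + sigma * noise,
# while the DDIM forward process gives  sqrt(alpha_bar) * sample + sqrt(1 - alpha_bar) * noise,
# with alpha_bar = 1 / (1 + sigma^2). Dividing by sqrt(1 + sigma^2) therefore
# brings the Euler-style latent back to the noise level the model saw in training.
def renoise_for_euler(scheduler, clean_latents, noise, t):
    noisy_latents = scheduler.add_noise(clean_latents, noise, t)
    step_index = (scheduler.timesteps == t).nonzero().item()
    sigma = scheduler.sigmas[step_index].to(noisy_latents.device, noisy_latents.dtype)
    return noisy_latents / torch.sqrt(1.0 + sigma ** 2.0)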

Please feel free to provide more details of your implementation if this does not solve your problem.

gladzhang commented 8 months ago

@TianxingWu Thanks for your answer. I still get a gif of solid color. I tried your suggestion, but it did not go well: I added the provided line at the end of the add_noise function and it does not work. I am still trying other methods.