guoyww / AnimateDiff

Official implementation of AnimateDiff.
https://animatediff.github.io
Apache License 2.0

SparseCtrl-RGB causes video interpolation to flash due to bad colors #387

Open aihopper opened 6 days ago

aihopper commented 6 days ago

This affects both video interpolation and video prediction.

Here is a minimal repro case that attempts to interpolate every other frame. Notice the flashing frames:

https://github.com/user-attachments/assets/e2784c49-18b7-4254-a127-286c94c83f10

Repro steps

  1. First, I used AnimateDiff to generate a video of a running man: video

  2. Then I took the generated video (which AnimateDiff should understand well) and removed all the odd frames.

  3. Finally, I used AnimateDiff to regenerate the removed frames by interpolating between the remaining ones. Here is the YAML file:

    # 3-prediction
    - adapter_lora_scale: 1.0
      adapter_lora_path: "models/Motion_Module/v3_sd15_adapter.ckpt"
      dreambooth_path:   ""
    
      inference_config: "configs/inference/inference-v3.yaml"
      motion_module:    "models/Motion_Module/v3_sd15_mm.ckpt"
    
      controlnet_config: "configs/inference/sparsectrl/latent_condition.yaml"
      controlnet_path:   "models/SparseCtrl/v3_sd15_sparsectrl_rgb.ckpt"
    
      H: 512
      W: 512
      seed:           45
      steps:          25
      guidance_scale: 8.5
    
      controlnet_image_indexs: [0,2,4,6,8,10,12,14]
    
      controlnet_images:
        - "running/readme_001.png"
        - "running/readme_003.png"
        - "running/readme_005.png"
        - "running/readme_007.png"
        - "running/readme_009.png"
        - "running/readme_011.png"
        - "running/readme_013.png"
        - "running/readme_015.png"
    
      prompt:
        - "man, full shot, running in a white suit, brown shoes, gray background, high quality, detailed"
    
      n_prompt:
        - "worst quality, low quality, letterboxed"
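For anyone reproducing this, the pairing between `controlnet_image_indexs` and `controlnet_images` in the config above can be generated rather than typed by hand. This is only a sketch: the helper name `sparse_conditioning` is hypothetical (it is not part of AnimateDiff), and it assumes the frames are saved with 1-based file names such as `running/readme_001.png`, while the indexes in the config are 0-based.

```python
from typing import List, Tuple


def sparse_conditioning(
    num_frames: int,
    pattern: str = "running/readme_{:03d}.png",  # assumed 1-based file naming
    step: int = 2,
) -> Tuple[List[int], List[str]]:
    """Return (0-based frame indexes, matching image paths) for every `step`-th frame.

    Hypothetical helper for building the config lists; not an AnimateDiff API.
    """
    # Keep frames 0, step, 2*step, ... and drop the ones in between.
    indexes = list(range(0, num_frames, step))
    # File names are 1-based, so 0-based frame i lives in file number i + 1.
    images = [pattern.format(i + 1) for i in indexes]
    return indexes, images


if __name__ == "__main__":
    indexes, images = sparse_conditioning(16)
    print("controlnet_image_indexs:", indexes)
    for path in images:
        print("-", path)
```

With `num_frames=16` this yields the eight indexes and eight file names used in the repro config, which makes it easier to check that the two lists stay aligned when trying other frame spacings.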

Notes:

aihopper commented 5 days ago

The main differences seem to be along the edges: the generated images seem to have softer edges.

image
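One way to put a number on the "softer edges" impression is to compare the variance of a Laplacian filter response between an original frame and the corresponding generated frame. The sketch below is a generic sharpness heuristic, not something taken from AnimateDiff; the function name `laplacian_variance` is an assumption, and it expects a 2-D grayscale array.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian over the image interior.

    Lower values suggest softer (blurrier) edges. This is a rough heuristic,
    not a calibrated sharpness metric.
    """
    g = np.asarray(gray, dtype=np.float64)
    # Sum of the four direct neighbours minus four times the centre pixel.
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())
```

To use it on real frames, the images could be loaded with Pillow, for example `np.asarray(Image.open(path).convert("L"))`, once for the original frame and once for the generated one. If the generated frames consistently score lower, that would support the observation that their edges are softer.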

Also, the histograms show notable differences. The image below is the generated one (brighter/blurrier):

image
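The histogram difference can also be summarized numerically, which helps when checking many frames. The following is a minimal sketch under stated assumptions: `compare_histograms` is a hypothetical helper (not part of AnimateDiff), inputs are 8-bit pixel values in the range 0 to 255, and the two reported numbers (mean brightness shift and total variation distance between normalized histograms) are just one reasonable choice of summary.

```python
import numpy as np


def compare_histograms(reference, generated, bins: int = 256) -> dict:
    """Compare pixel intensity distributions of two images (values in 0..255).

    Returns the mean brightness shift (generated minus reference) and the
    total variation distance between the normalized histograms (0 = identical,
    1 = no overlap). Hypothetical helper for illustration only.
    """
    ref = np.asarray(reference, dtype=np.float64).ravel()
    gen = np.asarray(generated, dtype=np.float64).ravel()

    ref_hist, _ = np.histogram(ref, bins=bins, range=(0, 256))
    gen_hist, _ = np.histogram(gen, bins=bins, range=(0, 256))

    # Normalize counts to probabilities so images of different sizes compare fairly.
    ref_p = ref_hist / ref_hist.sum()
    gen_p = gen_hist / gen_hist.sum()

    return {
        "mean_shift": float(gen.mean() - ref.mean()),
        "total_variation": float(0.5 * np.abs(ref_p - gen_p).sum()),
    }
```

A positive `mean_shift` on the interpolated frames, but not on the conditioning frames, would match the brighter look described above and would explain why alternating frames appear to flash.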