showlab / Tune-A-Video

[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
https://tuneavideo.github.io
Apache License 2.0
4.15k stars 377 forks

Unexpected results with Stable diffusion 2.1 #64

Closed jaykim9870 closed 1 year ago

jaykim9870 commented 1 year ago

Hi, I tested your code with both Stable Diffusion 1.5 and 2.1, and the reconstructed results are very different (Stable Diffusion 2.1 seems to do much worse). Here are examples using DAVIS dataset samples.

  1. car turn: (sd2.1) sample-100; (sd1.5) sample-500(1 5)

  2. bike: (sd2.1) sample-100_2_1; (sd1.5) sample-100_1_5_

  3. bear: (sd2.1) sample-100_2_1; (sd1.5) sample-100_1_5

Could you explain whether there is any reason why Stable Diffusion 2.1 performs poorly? Thanks for your great work!

jaykim9870 commented 1 year ago

Turns out we didn't use accelerate. Thanks!
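
For anyone hitting the same symptom, the fix reported above was to launch training through accelerate rather than running the script directly. Below is a minimal sketch of what that launch might look like from Python. The script name `train_tuneavideo.py` and the config path `configs/man-skiing.yaml` are assumptions for illustration and are not stated in this thread; the helper names `build_launch_command` and `launch` are hypothetical.

```python
import shlex
import shutil
import subprocess


def build_launch_command(config_path, script="train_tuneavideo.py"):
    """Return the argv list for starting training via `accelerate launch`.

    The default script name is an assumption, not taken from this thread.
    """
    return ["accelerate", "launch", script, f"--config={config_path}"]


def launch(config_path, dry_run=True):
    """Print the command (dry run) or execute it if accelerate is installed."""
    cmd = build_launch_command(config_path)
    if dry_run or shutil.which("accelerate") is None:
        # Only show what would be executed; nothing is launched.
        print(shlex.join(cmd))
        return None
    # check=True raises CalledProcessError if training exits with an error.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical config path; replace with your own.
    launch("configs/man-skiing.yaml")
```

With `dry_run=True` the sketch only prints the command, so it can be used to check the invocation before actually starting a run.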