thu-ml / cond-image-leakage

Official implementation for "Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model" (NeurIPS 2024)
Apache License 2.0

About compromising other performance #6

CIntellifusion closed this issue 2 months ago

CIntellifusion commented 2 months ago

Thanks for your excellent work. From the demo videos, other aspects such as image quality do not appear to be affected. Are there quantitative experiments that further validate how performance on those other aspects changes? Thanks for your reply.

zhuhz22 commented 2 months ago

Hi @CIntellifusion, thank you for your attention to our work! There is indeed a trade-off between dynamics, temporal consistency, and generation quality. In our inference strategy, if the initial M is set appropriately, it is possible to enhance motion without compromising image alignment or temporal consistency. As for a quantitative comparison, our paper validates these aspects in the user study, where M is properly set:

[image]

However, if M is set too low, temporal consistency and generation quality will suffer. You can experiment with the code yourself to get a feel for it.
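For readers who want to see what "setting M" could look like in practice, here is a minimal sketch. It assumes M is the time at which sampling starts, expressed as a fraction of the full noise schedule (1.0 being the usual starting point). The function name `sampling_times` and the evenly spaced schedule are hypothetical and are not taken from the repository's code.

```python
def sampling_times(start_m: float, num_steps: int) -> list[float]:
    """Return num_steps evenly spaced times, descending from start_m toward 0.

    Assumption: start_m plays the role of M, the start time of sampling as a
    fraction of the full schedule. start_m = 1.0 is the conventional start;
    smaller values skip the noisiest part of the schedule.
    """
    if not 0.0 < start_m <= 1.0:
        raise ValueError("start_m must be in (0, 1]")
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    # The i-th time is start_m scaled by the remaining fraction of steps.
    return [start_m * (num_steps - i) / num_steps for i in range(num_steps)]


if __name__ == "__main__":
    # Compare a full schedule with one that starts earlier (smaller M).
    for m in (1.0, 0.9, 0.7):
        print(m, [round(t, 3) for t in sampling_times(m, 5)])
```

Sweeping `start_m` over a few values like this, and comparing the generated videos for motion and temporal consistency, is one way to probe the trade-off described above.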

CIntellifusion commented 2 months ago

Hi! It is really hard to improve performance in one dimension while maintaining it in the others, but the user study basically shows that the gain outweighs the loss. Thanks for providing a new perspective on image leakage! REALLY NICE WORK.