Open HyoKong opened 4 months ago
Hi, thanks for your question! As mentioned in the Supplementary Document of the paper, the cfg of the two conditions can be set separately to allow more flexibility/controllability for the model. Using `--multiple_cond_cfg` together with `--cfg_img` lets you modify the cfg of the image condition. Without `--multiple_cond_cfg`, we use cfg=7.5 for both conditions (text and image).
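For readers wondering what two guidance scales could look like in practice, here is a minimal sketch. It is not taken from the released code: the function names, argument names, and the InstructPix2Pix-style composition of the two terms are assumptions made for illustration only. The inputs stand in for noise predictions and can be plain floats or arrays.

```python
# Illustrative sketch only. The composition below is an assumed,
# InstructPix2Pix-style form of two-condition classifier-free guidance;
# it is not confirmed to match the repository's implementation.

def single_scale_cfg(eps_uncond, eps_cond, scale=7.5):
    """Standard classifier-free guidance with one shared scale."""
    return eps_uncond + scale * (eps_cond - eps_uncond)


def two_scale_cfg(eps_uncond, eps_img, eps_img_txt, cfg_img=7.5, cfg_txt=7.5):
    """Guidance with separate scales for the image and text conditions.

    eps_uncond  : prediction with no condition
    eps_img     : prediction with the image condition only
    eps_img_txt : prediction with both image and text conditions
    """
    image_term = cfg_img * (eps_img - eps_uncond)   # push toward the image
    text_term = cfg_txt * (eps_img_txt - eps_img)   # then toward the text
    return eps_uncond + image_term + text_term


# Toy scalar stand-ins for the three noise predictions.
shared = single_scale_cfg(0.1, 0.9, scale=7.5)
split_equal = two_scale_cfg(0.1, 0.4, 0.9, cfg_img=7.5, cfg_txt=7.5)
split_custom = two_scale_cfg(0.0, 1.0, 2.0, cfg_img=2.0, cfg_txt=3.0)
```

In this assumed form, setting both scales to the same value makes the image-only term cancel, so the result equals ordinary single-scale guidance on the jointly conditioned prediction. Differences only appear once the image scale is set apart from the text scale.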
Thank you for the great work!
In Section 4.1 (Implementation Details) of your paper, you state that there are two guidance scales for text-conditioned image animation. I notice that the released run.sh omits `--multiple_cond_cfg`. Is there any difference with or without `--multiple_cond_cfg`, and will the performance be better without it?
Thank you so much for the help!