Closed: Shr1ftyy closed this issue 1 year ago.
Hi @Shr1ftyy, we have some discussion of the pose control part in Sec. 4 of the paper. Intuitively, we use T2I-Adapter as the pretrained T2I model and perform the editing conditioned on pose.
Thanks for getting back to me! I realized I was reading an older version of the paper 🤦🏾‍♂️. I'll close this issue with this comment.
Hello again, I was wondering if you have any plans to release examples that integrate ControlNet-OpenPose, T2I-Adapter, etc. with Tune-A-Video for inference. If so, could you provide an estimate of when they might be released?
Thanks.
Hi @Shr1ftyy, it was actually on my to-do list. However, I've been quite busy over the past few weeks and didn't manage to get to it. Feel free to open a PR if you'd like to contribute.
FYI, I spotted some follow-up works (e.g., FollowYourPose) that have implemented pose control in a way quite similar to ours.
Hi, in the paper you claim: "Our method can also be integrated with conditional T2I models like T2I-Adapter [29] and ControlNet [52], to enable diverse controls on the generated videos at no extra training cost." My intuition, combined with the above statement, leads me to assume that one does not need to re-finetune a pretrained T2I-Adapter (one already trained on Stable Diffusion 1.5, for example) in order to control a Stable Diffusion 1.5 model that has been modified and fine-tuned per the Tune-A-Video paper, and thereby achieve the kind of results displayed below (coherent pose-guided imagery).
Is this assumption correct?
Thanks again.
Yes, there's no need to fine-tune the adapter (control) part. Simply fine-tune SD 1.5 as described in Tune-A-Video.
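To make the "frozen adapter + fine-tuned UNet" idea concrete, here is a minimal conceptual sketch (not the authors' code, and not a real diffusers pipeline): the T2I-Adapter maps a pose image to multi-scale feature maps, which are added as residuals to the UNet encoder's features at matching resolutions. Because the conditioning enters purely as additive residuals, the adapter can stay frozen while only the UNet weights are fine-tuned per Tune-A-Video. All function names, shapes, and the number of scales below are illustrative assumptions, using dummy NumPy arrays in place of real networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_adapter(pose_map):
    """Stand-in for a pretrained T2I-Adapter (hypothetical): maps a pose
    image to multi-scale conditioning features. Here we just downsample;
    the real adapter is a small conv network, kept frozen at all times."""
    feats = []
    x = pose_map
    for _ in range(3):          # assume three UNet encoder scales
        x = x[::2, ::2]         # naive 2x downsample per scale
        feats.append(x.copy())
    return feats

def tuned_unet_encoder(latent, cond_feats):
    """Stand-in for the Tune-A-Video fine-tuned UNet encoder: each stage's
    features receive the adapter's features as an additive residual, so no
    new trainable parameters are introduced by the conditioning."""
    h = latent
    outs = []
    for f in cond_feats:
        h = h[::2, ::2]         # encoder downsampling stage
        h = h + f               # residual injection of pose features
        outs.append(h)
    return outs

pose = rng.standard_normal((64, 64))
latent = rng.standard_normal((64, 64))

feats = frozen_adapter(pose)
outs = tuned_unet_encoder(latent, feats)
print([o.shape for o in outs])  # → [(32, 32), (16, 16), (8, 8)]
```

In practice (e.g. with the diffusers library) the analogous step would be loading the pretrained adapter or ControlNet unchanged and swapping in the UNet you fine-tuned with Tune-A-Video; the sketch above only illustrates why no adapter retraining is needed.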
@zhangjiewu Hi, amazing work! Is it possible to provide example code showing how we can use the control mechanism (the adapter, as mentioned in the paper) with your model? It is not clear how to connect a pretrained T2I-Adapter with pose control to Tune-A-Video.
Hello, I was wondering how exactly you managed to perform "pose control" with Tune-A-Video? To my knowledge, the process isn't outlined in the Tune-A-Video paper.