Open rakesh-reddy95 opened 1 year ago
I am also looking into this. Maybe we need to select the specific noise at the beginning of the inference process. Let's share it here if anyone finds an answer.
You can refer to Make-A-Protagonist (https://arxiv.org/abs/2305.08850).
@zhangjiewu Currently, Tune-A-Video does text-guided video editing. How can we apply it to text+image-guided video editing? I just want to replace the subject in a video with the subject from a given photo, and use the text prompt to describe the action. Could you share any pointers if you have come across this in your research?