huiyan1804 opened 1 week ago
There is a related pull request (movie_editor.py) in https://github.com/jy0205/Pyramid-Flow/pull/112 for the video extension task. Although it does not implement video-to-video generation as expected, we believe the task itself should be achievable within our framework.
I'm thinking of using the same idea as img2img for video-to-video. First I encode the frames into latent space and add noise according to the scheduler's timestep, in units of 8 frames with the same shape as the latents being denoised. Then I tried either directly replacing the noisy latents or adding to them, but both give bad results, and I find it difficult to handle the jump points between the different resolutions.

An img2img task usually needs a denoising strength within 0-1. In your original pipeline, the noisy latent corresponds to denoising strength = 1, which means the input video is not referenced at all. When I set strength < 1, for example 0.4, denoising only runs the last 40% of the timesteps. If I apply this to all 3 stages, it cuts off the continuity between stages; if I only use strength 0.4 in the first low-res stage and 1 in the remaining 2 stages, the influence of the input video is too weak. Any good ideas?
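For reference, a minimal sketch of the strength-based renoising described above. The linear interpolation toward noise is an assumed flow-matching noising rule (it should be checked against Pyramid-Flow's actual scheduler), and the function names are hypothetical:

```python
import torch

def renoise_latents(latents, strength, noise=None):
    # SDEdit-style partial noising: strength 1.0 gives pure noise (no reference
    # to the input video), strength 0.0 keeps the encoded input unchanged.
    # The linear interpolation below is an assumed flow-matching noising rule.
    if noise is None:
        noise = torch.randn_like(latents)
    return (1.0 - strength) * latents + strength * noise

def start_step_for_strength(num_inference_steps, strength):
    # With strength 0.4 only the last 40% of the timesteps are run,
    # which is where the per-stage continuity problem shows up.
    return int(num_inference_steps * (1.0 - strength))
```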
To apply our framework to video-to-video generation, you need to downsample and renoise the latent to match the training latent at a certain timestep, and then start inference from that timestep.
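A minimal sketch of this downsample-and-renoise step, assuming a hypothetical per-stage spatial scale factor and a linear flow-matching interpolation toward noise; both details should be verified against the actual Pyramid-Flow scheduler and stage definitions:

```python
import torch
import torch.nn.functional as F

def prepare_v2v_start_latent(video_latents, start_t, stage_scale, noise=None):
    """Downsample and renoise an encoded video latent so it resembles the
    training latent at normalized timestep `start_t` (1.0 == pure noise),
    so inference can resume from that timestep instead of from pure noise.

    video_latents: (B, C, T, H, W) latent from the video VAE.
    stage_scale:   spatial downsampling factor of the pyramid stage that
                   start_t falls in (hypothetical; check the stage definitions).
    """
    # Match the spatial resolution of the target stage (time axis untouched).
    lat = F.interpolate(
        video_latents,
        scale_factor=(1.0, 1.0 / stage_scale, 1.0 / stage_scale),
        mode="trilinear",
        align_corners=False,
    )
    # Renoise toward the target timestep; the linear interpolation is an
    # assumed flow-matching noising rule, not the verified Pyramid-Flow one.
    if noise is None:
        noise = torch.randn_like(lat)
    return (1.0 - start_t) * lat + start_t * noise
```

Inference would then start from `start_t` in the stage whose resolution matches the downsampled latent, rather than from the first stage's pure-noise latent.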
t2v and i2v both work well, thanks for your work! Is there any chance to implement Pyramid Flow on the v2v task?