TIGER-AI-Lab / AnyV2V

Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" (TMLR 2024)
https://tiger-ai-lab.github.io/AnyV2V/
MIT License

Add Replicate demo and API #1

Closed · chenxwh closed this 7 months ago

chenxwh commented 7 months ago

Hi @vinesmsuic @lim142857 @wren93 ,

Very cool project on AnyV2V!

This pull request makes it possible to run AnyV2V on Replicate (https://replicate.com/cjwbw/AnyV2V) and via the API (https://replicate.com/cjwbw/AnyV2V/api). Currently, the demo covers prompt-based video editing. We'd also like to transfer the demo page / set up a redirect to TIGER-AI-Lab so you can make modifications easily, and we're happy to help maintain/integrate upcoming changes :)
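For reference, here is a rough sketch of what an API call could look like with Replicate's Python client. The model slug, version, and input field names below are assumptions for illustration; the actual schema is documented on the API page linked above.

```python
# Hedged sketch of calling the AnyV2V demo via the Replicate Python client.
# Requires the REPLICATE_API_TOKEN environment variable to be set.
# The model slug and input field names are illustrative; check
# https://replicate.com/cjwbw/AnyV2V/api for the real schema (you may also
# need to append a specific version hash to the slug).
import replicate

output = replicate.run(
    "cjwbw/anyv2v",  # assumed slug; the page above shows the exact identifier
    input={
        "video": open("source.mp4", "rb"),               # source video to edit
        "prompt": "turn the car into a red sports car",  # editing instruction (prompt-based editing)
    },
)
print(output)  # typically a URL (or list of URLs) pointing to the edited video
```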

vinesmsuic commented 7 months ago

Thanks @chenxwh, will look into it.

Max

lim142857 commented 7 months ago

@chenxwh Thank you for your hard work! I was wondering if we could add "ddim_init_latents_t_idx": 0 (default), "pnp_f_t": 1.0 (default), "pnp_spatial_attn_t": 1.0 (default), and "pnp_temp_attn_t": 1.0 (default) to the tweakable configs on the Replicate page.

chenxwh commented 7 months ago

> @chenxwh Thank you for your hard work! I was wondering if we could add "ddim_init_latents_t_idx": 0 (default), "pnp_f_t": 1.0 (default), "pnp_spatial_attn_t": 1.0 (default), and "pnp_temp_attn_t": 1.0 (default) to the tweakable configs on the Replicate page.

Sure, happy to! Could you maybe provide short descriptions of those variables so I can add them to the demo too? I think it'll help people understand better how to set them. Thank you!

vinesmsuic commented 7 months ago

> @chenxwh Thank you for your hard work! I was wondering if we could add "ddim_init_latents_t_idx": 0 (default), "pnp_f_t": 1.0 (default), "pnp_spatial_attn_t": 1.0 (default), and "pnp_temp_attn_t": 1.0 (default) to the tweakable configs on the Replicate page.
>
> Sure, happy to! Could you maybe provide short descriptions of those variables so I can add them to the demo too? I think it'll help people understand better how to set them. Thank you!

chenxwh commented 7 months ago

thanks @vinesmsuic, I have added those to the demo now!

lim142857 commented 7 months ago

@chenxwh Thanks! Please check out these updated config descriptions:

vinesmsuic commented 7 months ago

@chenxwh Right, we found that 1.0 for the three PnP injection values works best for prompt-based editing on I2VGen-XL, so it might be better to use 1.0 in the demo. Sorry for the confusion. Could you push another commit with that change?
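For context, here is a rough config sketch of what these settings control. The one-line descriptions are a best-effort reading of the PnP-injection setup, not official documentation; please check them against the repo's own configs.

```python
# Illustrative defaults for the tweakable AnyV2V settings discussed above.
# The inline descriptions are best-effort summaries, not official documentation.
pnp_config = {
    "ddim_init_latents_t_idx": 0,  # index of the DDIM-inverted latent used to initialize sampling
    "pnp_f_t": 1.0,                # fraction of denoising steps with convolution-feature injection from the source video
    "pnp_spatial_attn_t": 1.0,     # fraction of steps with spatial self-attention injection
    "pnp_temp_attn_t": 1.0,        # fraction of steps with temporal attention injection
}
```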

chenxwh commented 7 months ago

Sure! The latest changes reflect the updated default values, with detailed descriptions for each of them. An updated example has also been added to the demo.

chenxwh commented 7 months ago

Thanks for the merge! I have redirected the page to https://replicate.com/tiger-ai-lab/anyv2v and added you to the tiger-ai-lab org (https://replicate.com/tiger-ai-lab), so you have permission to make any changes to the page! Always happy to help push updates :)

lim142857 commented 7 months ago

@chenxwh Thanks a lot for the contribution! Could you also add me to the tiger-ai-lab org (https://replicate.com/tiger-ai-lab)? :)

chenxwh commented 7 months ago

Sure thing @lim142857! Just added you as well :D

vinesmsuic commented 7 months ago

Hi @chenxwh, I wonder if we can modify the demo to allow users to input their own edited_1st_frame, which would override the instruction prompt if provided? It seems a lot of users want to try it out with their own edited first frame instead of the InstructPix2Pix output.

chenxwh commented 7 months ago

Sure, I will make the changes later today :)

wenhuchen commented 7 months ago

I think letting people upload an image probably causes too much overhead; people might need to visit another website to do it. It's a bit complex.

@chenxwh I'm wondering whether it's possible to break the demo into two steps, because the first-step result from InstructPix2Pix is not very stable. We can sweep several hyperparameters (random seed, CFG params) to let InstructPix2Pix generate a few different candidate images (users can even re-run this until they are happy with the result). This should be quite cheap. Then a user can click on the image they are most satisfied with to continue with video generation. This should dramatically increase the success rate.
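As a rough illustration of the two-step idea (not the demo's actual code), here is a hedged sketch that sweeps seeds and guidance scales with the public diffusers InstructPix2Pix pipeline to produce several candidate first frames; paths, prompt, and parameter values are placeholders.

```python
# Hedged sketch of the two-step idea: sweep seeds and CFG parameters with
# InstructPix2Pix to produce several candidate edited first frames, then let the
# user pick one before running video generation. Uses the public diffusers
# pipeline as a stand-in; paths, prompt, and parameter values are illustrative.
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

first_frame = Image.open("first_frame.png").convert("RGB")
prompt = "make it look like a watercolor painting"

candidates = []
for seed in (0, 1, 2):
    for guidance_scale in (5.0, 7.5):
        generator = torch.Generator(device="cuda").manual_seed(seed)
        image = pipe(
            prompt,
            image=first_frame,
            guidance_scale=guidance_scale,   # text guidance strength
            image_guidance_scale=1.5,        # how closely to follow the source frame
            num_inference_steps=20,
            generator=generator,
        ).images[0]
        candidates.append(((seed, guidance_scale), image))

# The selected candidate would then be passed to AnyV2V as the edited first frame.
```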

chenxwh commented 7 months ago

> I think letting people upload an image probably causes too much overhead; people might need to visit another website to do it. It's a bit complex.
>
> @chenxwh I'm wondering whether it's possible to break the demo into two steps, because the first-step result from InstructPix2Pix is not very stable. We can sweep several hyperparameters (random seed, CFG params) to let InstructPix2Pix generate a few different candidate images (users can even re-run this until they are happy with the result). This should be quite cheap. Then a user can click on the image they are most satisfied with to continue with video generation. This should dramatically increase the success rate.

The demo on the website only supports end-to-end inference, so I think the best way is to offer an option to either use the default full pipeline or accept a user-provided first frame obtained from an existing InstructPix2Pix model.

vinesmsuic commented 7 months ago

Hi @chenxwh, just discussed with @wenhuchen and we would love to stick to the original plan (modify the demo to allow users to input their own edited_1st_frame, which overrides the instruction prompt if provided). Really appreciate your help :)
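A minimal sketch of what that branching might look like in the predictor, assuming hypothetical helpers (extract_first_frame, run_instructpix2pix, run_anyv2v) and an optional edited_1st_frame input; the names here are illustrative, not the actual predict.py.

```python
# Illustrative predictor logic only: if the user supplies an edited first frame,
# it overrides the prompt-based InstructPix2Pix step; otherwise the default full
# pipeline runs. extract_first_frame / run_instructpix2pix / run_anyv2v are
# hypothetical helpers standing in for the real implementation.
from typing import Optional

def predict(video_path: str, prompt: str, edited_1st_frame: Optional[str] = None) -> str:
    if edited_1st_frame is None:
        # Default full pipeline: derive the edited first frame from the prompt.
        source_frame = extract_first_frame(video_path)
        edited_1st_frame = run_instructpix2pix(source_frame, prompt)
    # Propagate the edited first frame through the rest of the video with AnyV2V.
    return run_anyv2v(video_path, edited_1st_frame)
```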

chenxwh commented 7 months ago

A new version is pushed to Replicate now :) and I have opened another PR for the change.