YanzuoLu / CFLD

[CVPR 2024 Highlight] Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
MIT License
165 stars 11 forks

Pose2Video #4

Closed: zhou-linpeng closed this issue 6 months ago

zhou-linpeng commented 6 months ago

Hi, thanks for releasing this amazing program! Do you have any plans to extend the UNet from 2D to 3D to enable pose-to-video generation, like Animate Anyone?

YanzuoLu commented 6 months ago

It would be an interesting idea to extend the current approach to 3D or video generation. To be honest, we are still at a very preliminary stage, looking into which datasets and how much data would be needed. As you may know, the training dataset of Animate Anyone is not public 😢 We are worried that the current model capacity and the available data are not enough to support better results, although capacity should be fine since we can use SDXL. Nevertheless, we welcome the exchange of more ideas. Thanks for your attention to our work.