Image-video joint training

magic-research / magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

https://showlab.github.io/magicanimate/

BSD 3-Clause "New" or "Revised" License


Image-video joint training #156

Closed: Lanhaoran closed this issue 7 months ago

Lanhaoran commented 7 months ago

Hello! I noticed that, as mentioned in the paper, an image-video joint training strategy was used in the training stage, with the images taken from the LAION-400M dataset. How did you select human images from LAION-400M as training data? And approximately how many images were used? Thank you!

zcxu-eric commented 7 months ago

Hi, we used an internal version selected by other colleagues, and the total is about 1 million images. We actually found that the LAION data can improve clothing details but does not affect the overall results much, so you can start training without it.

Lanhaoran commented 7 months ago

Thanks for your response. Are there any plans to release this portion of the dataset?

zcxu-eric commented 7 months ago

We have no plans to release it, because this dataset is built on top of some internal libraries.