NVIDIA / vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Question on pose data #128


malcolmgarner-movement commented 4 years ago

Loving this work; it's amazing what has been accomplished in this field of research!

I have a question about the pose-to-body example: how does this actually work? Are two videos taken, with the actions from one transposed onto the actor in the second?

I've read through the documentation, but so far I haven't been able to work out the basic mechanic behind this particular example.

Any assistance would be appreciated, and if there is material I might have missed, please don't hesitate to point me in the right direction.

Thanks in advance

pranavraikote commented 3 years ago

@malcolmgarner-movement Based on the given DensePose mask frames + input frames + OpenPose files, you can provide your choice of target images, and it should generate something similar. That is as far as my understanding goes; I have yet to test it thoroughly with different permutations and combinations.
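
For what it's worth, here is my (untested) mental model of the inputs as a minimal sketch. `make_pose_condition` is a hypothetical helper, not code from this repo: the rendered OpenPose skeleton and the DensePose map for each frame get stacked channel-wise into a single label map, which would line up with the 6-channel input the pose examples seem to use.

```python
import numpy as np
import torch

# Hypothetical helper (not from this repo): stack the per-frame OpenPose
# rendering and DensePose map channel-wise into the conditioning tensor
# the generator is assumed to consume.
def make_pose_condition(openpose_img: np.ndarray,
                        densepose_img: np.ndarray) -> torch.Tensor:
    """Combine two (H, W, 3) uint8 images into one (6, H, W) float tensor."""
    assert openpose_img.shape == densepose_img.shape
    stacked = np.concatenate([openpose_img, densepose_img], axis=2)  # (H, W, 6)
    tensor = torch.from_numpy(stacked).permute(2, 0, 1).float()      # (6, H, W)
    return tensor / 127.5 - 1.0                                      # scale to [-1, 1]
```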

git-hamza commented 3 years ago

@pranavraikote I don't think the documentation mentions target images anywhere. Doesn't it say that it synthesizes based on the videos it learned from (the training videos)?
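
That matches my reading of the dataset layout below (the directory names are my interpretation of the pose example, and `check_pose_dataset` is just an illustrative helper): there is no test image folder at all, only test pose folders, so at inference the network can only re-pose the actor it learned during training.

```python
from pathlib import Path

# Assumed layout for the pose dataset (verify against the pose dataset code
# in this repo before relying on it):
#
#   datasets/pose/
#     train_img/        # frames of the target actor (the identity learned)
#     train_openpose/   # OpenPose renderings for those frames
#     train_densepose/  # DensePose maps for those frames
#     test_openpose/    # driving poses, typically from a different video
#     test_densepose/
#
# Note there is no test_img/: at test time only pose inputs are given, and
# the generator reproduces the actor it memorized during training.
def check_pose_dataset(root: str = "datasets/pose") -> None:
    for sub in ("train_img", "train_openpose", "train_densepose",
                "test_openpose", "test_densepose"):
        path = Path(root) / sub
        print(f"{path}: {'found' if path.is_dir() else 'MISSING'}")
```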