NVIDIA / vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Pretrained Pose-to-body Model? #31

Open aman-tiwari opened 6 years ago

aman-tiwari commented 6 years ago

Hi, thank you for this wonderful research! I was wondering if you were planning to release the pre-trained models for the pose-to-body translation task. If not, could you release the hyper-parameters used to train the models for that task? Thank you for any help!

tcwang0509 commented 6 years ago

I'm not sure we can release that model due to copyright issues. The training code is committed now, so you're welcome to try it yourself.
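
For a single-GPU pose run, the example scripts under scripts/pose are a good starting point. Something along these lines should work (a sketch with flag values taken from those scripts at the time of writing; they may change between versions, so check the current repo before copying):

```bash
python train.py --name pose2body_256p_g1 \
  --dataroot datasets/pose --dataset_mode pose --input_nc 6 \
  --loadSize 384 --fineSize 256 --resize_or_crop randomScaleHeight_fixedCrop \
  --gpu_ids 0 --batchSize 1 --max_frames_per_gpu 3 \
  --no_first_img --n_frames_total 12 --max_t_step 4
```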

petergerten commented 6 years ago

@tcwang0509 It would be great if the model could be released (or maybe a new model trained on data that is free of copyright issues).

As most people won't have a DGX-1 available, training the 2K-resolution net for 10 days on a p3.16xlarge instance (which also has 8 V100 GPUs) would cost about USD 5,875 on AWS (USD 24.48/hour on-demand pricing × 24 hours/day × 10 days).

And it seems that this won't even work, since you state at least 24 GB of memory is required per GPU. I am not aware of any cloud provider currently offering GPUs with more than 16 GB.

tcwang0509 commented 6 years ago

@petergerten the requirements you mentioned are for training on Cityscapes. For pose, it should take only 5-7 days even on a single GPU.

petergerten commented 6 years ago

@tcwang0509 great, thanks for the clarification

ChenyuGao commented 6 years ago

The research is very exciting! I also hope the pretrained model will be released~

bube5h commented 5 years ago

@tcwang0509 How many days will training take for the face model?

therobotprogrammer commented 5 years ago

So I've found a pre-trained model used by DensePose: https://dl.fbaipublicfiles.com/densepose/DensePose_ResNet101_FPN_s1x-e2e.pkl

Now, if only there were a way to convert this DensePose model to a format used by vid2vid. Assuming both libraries use the same formatting to declare the input and output tensors, would this be possible?

I'm new to PyTorch, so please pardon the noob question.
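
One thing worth noting: that checkpoint is a Detectron/Caffe2 pose estimator (image in, body-surface coordinates out), while the vid2vid pose model is a generator (pose maps in, RGB frames out), so the two networks don't share an architecture and the weights can't simply be renamed across. For anyone who wants to poke at the file anyway, here is a minimal inspection sketch, assuming the file follows the usual Detectron convention of a plain pickle holding a 'blobs' dict of numpy arrays:

```python
import pickle

# Detectron-era checkpoints are plain pickles of numpy arrays keyed by
# Caffe2 blob names (an assumption here; adjust if the layout differs).
with open('DensePose_ResNet101_FPN_s1x-e2e.pkl', 'rb') as f:
    ckpt = pickle.load(f, encoding='latin1')  # latin1 handles py2-era pickles

# Weights usually sit under a 'blobs' dict; fall back to the top level.
blobs = ckpt.get('blobs', ckpt)
for name, arr in sorted(blobs.items()):
    shape = getattr(arr, 'shape', None)  # numpy arrays expose .shape
    print(name, shape)
```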

kartikJ-9 commented 4 years ago

Any update on this one? @therobotprogrammer's DensePose suggestion above seems like a good approach to get a pretrained model for pose.