johndpope / MegaPortrait-hack

Using Claude Opus to reverse engineer code from MegaPortraits: One-shot Megapixel Neural Head Avatars
https://arxiv.org/abs/2207.07621

experiment on reconstructing the input image #30

Closed flyingshan closed 3 weeks ago

flyingshan commented 3 weeks ago

I'm trying to run an experiment that trains the model to reconstruct the input image using only Eapp + G3d + G2d. I use two videos as the training set, but I found the model could not overfit to the input images: it loses high-frequency information. Has anyone trained a model that produces images as sharp as the input using the network structure described in the paper?
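For reference, here is a minimal sketch of the reconstruction path I mean, assuming module names that mirror the paper's notation (`Eapp`, `G3d`, `G2d`); the actual classes in this repo may differ:

```python
import torch
import torch.nn as nn

class ReconstructionPipeline(nn.Module):
    """Identity reconstruction: encode appearance, refine the 3D feature
    volume, decode back to RGB. No warping or motion descriptors."""
    def __init__(self, eapp: nn.Module, g3d: nn.Module, g2d: nn.Module):
        super().__init__()
        self.eapp = eapp  # appearance encoder -> 3D feature volume
        self.g3d = g3d    # 3D generator refining the volume
        self.g2d = g2d    # 2D generator projecting features to an image

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        v = self.eapp(x)    # volumetric features of the input frame
        v = self.g3d(v)     # refine in 3D
        return self.g2d(v)  # reconstructed image, same size as x
```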

johndpope commented 3 weeks ago

It could be many things. I gave up after 10 epochs; I don't want to burn out my old 3090. The losses are critical, and there's a new PR that adjusts things to the driving image (maybe that's wrong too). How many epochs did you get to? It's so slow.

flyingshan commented 3 weeks ago

I only use the VGG19 perceptual loss and the VGGFace2 loss in this experiment. I trained for about 30k iterations on these two videos; it takes about 0.5 s per iteration (A100 GPU, 256 resolution).
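The perceptual term is wired up roughly like this (a sketch, assuming torchvision's pretrained VGG19; the layer indices are illustrative, and the VGGFace2 loss follows the same pattern with a face-recognition backbone in place of VGG19):

```python
import torch.nn as nn
from torchvision import models

class Vgg19PerceptualLoss(nn.Module):
    """L1 distance between intermediate VGG19 feature maps of the
    prediction and the target. Inputs are expected to be
    ImageNet-normalized; layer indices here are illustrative."""
    def __init__(self, layer_ids=(2, 7, 12, 21, 30)):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        self.features = vgg.features.eval()
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.layer_ids = set(layer_ids)

    def forward(self, pred, target):
        loss, x, y = 0.0, pred, target
        for i, layer in enumerate(self.features):
            x, y = layer(x), layer(y)
            if i in self.layer_ids:
                loss = loss + (x - y).abs().mean()
        return loss
```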

johndpope commented 3 weeks ago

This code only supports 512 resolution at the moment.

If you look at the training code, it has the verbatim implementation of the paper's losses; nothing is added and only the gaze loss is removed.

My PR bypasses the crop and warp

I'm looking to rearchitect this pipeline next week using some novel approaches.

Training for the base model has to go to 200,000 iterations, but if it deviates from the paper it may be a bin job.

johndpope commented 3 weeks ago

Currently rebuilding the cycle consistency loss function; this will completely overhaul training.
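For anyone following along, the paper describes that term as a contrastive cosine loss over motion descriptors. A hedged sketch of that idea (function name, shapes, and hyperparameters are assumptions, not the repo's final values):

```python
import torch
import torch.nn.functional as F

def contrastive_cosine_loss(anchor: torch.Tensor,
                            positive: torch.Tensor,
                            negatives: torch.Tensor,
                            scale: float = 5.0,
                            margin: float = 0.2) -> torch.Tensor:
    """Softmax over scaled cosine similarities: descriptors from matching
    source/driver pairs should score higher than mismatched pairs.
    anchor/positive: (B, D); negatives: (B, N, D). Illustrative values."""
    pos = F.cosine_similarity(anchor, positive, dim=-1)                  # (B,)
    neg = F.cosine_similarity(anchor.unsqueeze(1), negatives, dim=-1)    # (B, N)
    # Positive pair sits at column 0 of each row of logits.
    logits = torch.cat([(pos - margin).unsqueeze(1), neg], dim=1) * scale
    labels = torch.zeros(anchor.size(0), dtype=torch.long,
                         device=anchor.device)
    return F.cross_entropy(logits, labels)
```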