johndpope / MegaPortrait-hack

Using Claude Opus to reverse engineer code from MegaPortraits: One-shot Megapixel Neural Head Avatars
https://arxiv.org/abs/2207.07621

DRAFT - training overhaul - wip #21

Closed johndpope closed 3 weeks ago

johndpope commented 4 weeks ago

The contrastive loss is blowing up.

johndpope commented 3 weeks ago

@kwentar / @jackailab / @Jie-zju / @robinchm / @flyingshan https://github.com/johndpope/MegaPortrait-hack/issues/14

The code is SLOWLY training on my local 3090 GPU at 512x512. I haven't tested inference yet.

[Screenshot from 2024-05-30 12-23-22]

To run training at 256x256, I ripped out the avgpool, though there may be a cleaner way: https://github.com/johndpope/MegaPortrait-hack/blob/main/model.py#L254
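A rough sketch of why a fixed-size average pool can break when the input resolution is halved. It uses the standard floor-mode pooling output-size formula. The downsample factor and kernel size are hypothetical placeholders, not values read from `model.py`.

```python
from typing import Optional


def avgpool_out_size(size: int, kernel: int,
                     stride: Optional[int] = None, padding: int = 0) -> int:
    """Output length along one dimension for floor-mode average pooling."""
    stride = kernel if stride is None else stride
    padded = size + 2 * padding
    if padded < kernel:
        # A pooling window larger than the (padded) input has nothing to pool.
        raise ValueError(f"input size {padded} is smaller than kernel {kernel}")
    return (padded - kernel) // stride + 1


# ASSUMPTIONS for illustration only: the network shrinks the image by 64x
# before pooling, and the pool kernel is 8. Neither comes from the repo.
ASSUMED_DOWNSAMPLE = 64
ASSUMED_KERNEL = 8

for resolution in (512, 256):
    feature_size = resolution // ASSUMED_DOWNSAMPLE
    try:
        pooled = avgpool_out_size(feature_size, ASSUMED_KERNEL)
        print(f"{resolution}px: feature map {feature_size} pools to {pooled}")
    except ValueError as err:
        print(f"{resolution}px: feature map {feature_size} fails ({err})")
```

If this is the failure mode, one option instead of deleting the layer might be PyTorch's `torch.nn.AdaptiveAvgPool2d`, which takes a target output size rather than a kernel size. Whether it fits this model is untested here.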

[Image: output_frame_93]

I don't really want to burn out my GPU, but there's an HQ torrent we could use to train in the cloud.

UPDATE: I think it just blew up from using too much VRAM. Going to set the save interval to 100 (the paper uses 200,000).
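A minimal sketch of a step-based save interval, assuming a training loop with a 1-based step counter. `should_save` and the step counts are hypothetical, not taken from the repo's training script.

```python
def should_save(step: int, save_interval: int) -> bool:
    """Return True on every save_interval-th step (steps counted from 1)."""
    if save_interval <= 0:
        raise ValueError("save_interval must be a positive integer")
    return step > 0 and step % save_interval == 0


SAVE_INTERVAL = 100  # value mentioned in the comment above

# Collect the steps at which a checkpoint would be written in a 500-step run.
saved_steps = [step for step in range(1, 501) if should_save(step, SAVE_INTERVAL)]
print(saved_steps)
```

A short interval like this means losing at most 100 steps of progress if the run crashes, at the cost of more frequent disk writes.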