johndpope / MegaPortrait-hack

Using Claude Opus to reverse engineer code from MegaPortraits: One-shot Megapixel Neural Head Avatars
https://arxiv.org/abs/2207.07621

IMPORTANT

My VASA hack project (https://github.com/johndpope/vasa-1-hack) has running training code for stage 1 (MegaPortraits), with hot fixes: https://github.com/johndpope/VASA-1-hack/blob/main/train_stage_1.py

MegaPortrait - SamsungLabs AI - Russia

An implementation of MegaPortraits using Claude Opus.

All models and code are in model.py


memory debug

    mprof run train.py

or, without memory profiling, just

    python train.py

UPDATES

EmoDataset

warp / crop / spline / remove background / transforms

Training Data (☢️ don't need this yet.)

Training Strategy

For now, to simplify the problem, use the 4 videos in the junk folder. Once the models are validated, video_dir can be pointed at the data from the CelebV-HQ torrent below.

    # video_dir: '/Downloads/CelebV-HQ/celebvhq/35666'
    video_dir: './junk'

Preprocessing takes 1-2 minutes per video, so I added saving to npz format for faster reloading.
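A minimal sketch of how npz caching like this can work, assuming the preprocessed frames for a video fit in a single NumPy array. The names `load_or_preprocess`, `cache_dir`, `preprocess_fn`, and the `frames` key are hypothetical and are not taken from this repo's code:

```python
from pathlib import Path

import numpy as np


def load_or_preprocess(video_path, cache_dir, preprocess_fn):
    """Return preprocessed frames for video_path, reusing an .npz cache if present.

    preprocess_fn is a hypothetical callable that takes the video path and
    returns an array-like of frames (the slow 1-2 minute step).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # One cache file per video, named after the video file stem.
    cache_file = cache_dir / (Path(video_path).stem + ".npz")

    if cache_file.exists():
        # Fast path: reload the stored array instead of re-running preprocessing.
        with np.load(cache_file) as data:
            return data["frames"]

    # Slow path: run preprocessing once, then store the result for next time.
    frames = np.asarray(preprocess_fn(video_path))
    np.savez_compressed(cache_file, frames=frames)
    return frames
```

The first call for a given video pays the full preprocessing cost; later calls only read the compressed file. If the preprocessing steps change, the cached files need to be deleted by hand, since this sketch does not detect stale caches.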

Torrent Download

You can download the CelebV-HQ dataset via the magnet link below or by visiting Academic Torrents.

magnet:?xt=urn:btih:843b5adb0358124d388c4e9836654c246b988ff4&dn=CelebV-HQ&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=https%3A%2F%2Fipv6.academictorrents.com%2Fannounce.php

Implemented Functionality / Descriptions

Base Model (Gbase)

High-Resolution Model (GHR)

Student Model (Student)

Gaze and Blink Loss Model

Training Functions

Training Pipeline

Main Function

rome/losses - cherry-picked from https://github.com/SamsungLabs/rome

Download the ResNet-18 weights with wget 'https://download.pytorch.org/models/resnet18-5c106cde.pth' and save the file to the state_dicts folder.

RT-GENE (Real-Time Gaze Estimation) - couldn't get it working

    git clone https://github.com/Tobias-Fischer/rt_gene.git
    cd rt_gene/rt_gene
    pip install .