johndpope / SPEAK-hack

Using Claude Sonnet to reverse engineer the paper "Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation"
https://arxiv.org/pdf/2405.07257

GAN discriminator fails to converge after 20 epochs. #8

Open johndpope opened 3 months ago

johndpope commented 3 months ago

[Image: Screenshot from 2024-07-06 04-35-36]

The paper says they use a StyleGAN architecture.

That means refactoring to this (unofficial) StyleGAN implementation: https://raw.githubusercontent.com/tomguluson92/StyleGAN_PyTorch/4fd0711f560b9b080fca3df2448822835226ba02/networks_stylegan.py

or upgrading to StyleGAN2-ADA (official): https://github.com/NVlabs/stylegan2-ada-pytorch/blob/main/training/networks.py
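
For reference when comparing against either option: as far as I know, the official StyleGAN2-ADA training code uses the non-saturating logistic loss, i.e. softplus applied to the discriminator logits. Below is a minimal sketch of that arithmetic in plain Python (no PyTorch, lists of floats instead of tensors). The function names are my own for illustration, not the repo's API, and the R1 gradient penalty is left out.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(real_logits, fake_logits):
    """Non-saturating logistic loss for D (hypothetical helper name).

    Real samples are pushed toward positive logits, fakes toward negative.
    """
    real_term = sum(softplus(-r) for r in real_logits) / len(real_logits)
    fake_term = sum(softplus(f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def generator_loss(fake_logits):
    """Non-saturating logistic loss for G (hypothetical helper name)."""
    return sum(softplus(-f) for f in fake_logits) / len(fake_logits)


if __name__ == "__main__":
    # A discriminator that outputs logit 0 for everything is at chance level.
    print(discriminator_loss([0.0, 0.0], [0.0, 0.0]))
    # A discriminator that separates real from fake well has a loss near 0.
    print(discriminator_loss([8.0, 9.0], [-8.0, -9.0]))
```

With this loss, a discriminator that cannot tell real from fake (logit 0 everywhere) sits at 2·ln 2 ≈ 1.386, so a D loss hovering around that value is one thing worth checking in the training curves. Whether that matches the screenshot above is an assumption on my part.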

johndpope commented 3 months ago

I have a new StyleGAN2 architecture here: https://github.com/johndpope/IMF Maybe I can circle back to redo this one.