Kevinfringe / MegaPortrait

Implementation of Megaportrait

Update train.py by Claude OPUS (untested) - DONT MERGE #6

Open johndpope opened 4 months ago

johndpope commented 4 months ago

Key additions/changes:

  1. Added the distill function to implement the student model distillation process described in Section 3.3 of the paper. This involves:

    • Setting the teacher model (high-res generator G2d) to eval mode
    • Training loop that samples driver frames and avatar indices, generates pseudo-ground truth with the teacher, gets student predictions, calculates perceptual and adversarial losses between student and teacher outputs, and optimizes the student
    • Printing distillation loss periodically
  2. Added command line arguments for student model distillation:

    • --num-avatars: Number of avatars to distill to the student (default 100)
    • --print-freq: Print frequency for logging distillation loss
  3. Updated the main function to:

    • Instantiate the student model
    • Call the distill function after the base and high-res models are trained
  4. Defined perceptual loss L_per and adversarial loss L_adv that are used during distillation (implementation not shown, placeholders used)

  5. Minor fixes like device placement of some models
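The distillation step described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual PR code: the function and argument names (`distill`, `l_per`, `l_adv`, the `(driver, avatar_idx)` student signature) are assumptions, and the real losses are only placeholders in the PR as well.

```python
import torch
import torch.nn as nn

def distill(teacher, student, dataloader, l_per, l_adv,
            num_avatars=100, print_freq=10, epochs=1, device="cpu"):
    """Distill a frozen high-res teacher into a lightweight student (Sec. 3.3).

    Names and signatures here are illustrative; the student is assumed to
    condition on an avatar index from a fixed set of `num_avatars` avatars.
    """
    teacher.eval()        # teacher stays frozen during distillation
    student.train()
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)

    step = 0
    for _ in range(epochs):
        for driver in dataloader:                       # sampled driver frames
            driver = driver.to(device)
            # sample a random avatar index per batch element
            avatar_idx = torch.randint(0, num_avatars,
                                       (driver.size(0),), device=device)
            with torch.no_grad():
                pseudo_gt = teacher(driver)             # teacher pseudo-ground truth
            pred = student(driver, avatar_idx)          # student prediction
            # perceptual + adversarial losses between student and teacher outputs
            loss = l_per(pred, pseudo_gt) + l_adv(pred, pseudo_gt)
            opt.zero_grad()
            loss.backward()
            opt.step()
            if step % print_freq == 0:
                print(f"step {step}: distill loss {loss.item():.4f}")
            step += 1
```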

So in summary, the key addition is the code for the distillation process to train a lightweight student model that can mimic the teacher model's outputs for a fixed set of avatars. The training process and losses are implemented based on the description in the paper.
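Since the PR leaves L_per and L_adv as placeholders, a minimal sketch of what they might look like is below. The class names are assumptions; the perceptual loss uses a tiny frozen conv stack purely to keep the sketch self-contained, where a real setup would use pretrained features (e.g. VGG), and the adversarial term is a standard hinge generator loss against an externally trained discriminator.

```python
import torch
import torch.nn as nn

class PerceptualLoss(nn.Module):
    """L_per placeholder: L1 distance in the feature space of a frozen network."""
    def __init__(self, features=None):
        super().__init__()
        if features is None:
            # tiny frozen stand-in; real setups use pretrained VGG features
            features = nn.Sequential(
                nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            )
        self.features = features
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, pred, target):
        return nn.functional.l1_loss(self.features(pred), self.features(target))

class AdversarialLoss(nn.Module):
    """L_adv placeholder: hinge generator loss from a discriminator's scores."""
    def __init__(self, disc):
        super().__init__()
        self.disc = disc   # discriminator is assumed trained elsewhere

    def forward(self, pred, target=None):
        return -self.disc(pred).mean()
```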

johndpope commented 4 months ago

UPDATE - don't merge - I will drop in some code I built out for https://github.com/johndpope/Emote-hack that loads mp4s using decord and pulls the source image / driving image from frames.

johndpope commented 4 months ago

When I started pulling the training code apart with Claude, it needed additional models, which resulted in almost a complete rewrite. I moved to this new project: https://github.com/johndpope/MegaPortrait-hack