MStypulkowski / diffused-heads

Official repository for Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation

About the release date of the training script #21

Closed kaiw7 closed 6 months ago

kaiw7 commented 7 months ago

Hi, when do you plan to release your training script? Thank you very much.

johndpope commented 6 months ago

I think I can get Claude3 to spit this out. Give me a few days. I should be able to just upload the paper and all the respective code, maybe with a schema dump of the checkpoint, but presumably it's just a vanilla Stable Diffusion. I'm attempting to recreate the Emote paper (https://github.com/johndpope/emote-hack) and I'll need this training code.

UPDATE: Digging into the model, it doesn't look like Stable Diffusion under the hood. I was hoping I could decipher the architecture, but it's all wrapped up in TorchScript. The models are not aligning, and god knows how this model was prepared. This is the printed module structure of crema_script.pt:

RecursiveScriptModule(
  original_name=UNet
  (time_embed): RecursiveScriptModule(
    original_name=TimestepEmbedding
    (fc): RecursiveScriptModule(
      original_name=Sequential
      (0): RecursiveScriptModule(original_name=Linear)
      (1): RecursiveScriptModule(original_name=SiLU)
      (2): RecursiveScriptModule(original_name=Linear)
    )
  )
  (audio_embed): RecursiveScriptModule(
    original_name=Sequential
    (0): RecursiveScriptModule(original_name=Linear)
    (1): RecursiveScriptModule(original_name=SiLU)
    (2): RecursiveScriptModule(original_name=Linear)
  )
  (input_blocks): RecursiveScriptModule(
    original_name=ModuleList
    (0): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(original_name=Conv2d)
    )
    (1): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (2): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (3): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(
          original_name=Downsample
          (avg_pool): RecursiveScriptModule(original_name=AvgPool2d)
        )
        (x_upd): RecursiveScriptModule(
          original_name=Downsample
          (avg_pool): RecursiveScriptModule(original_name=AvgPool2d)
        )
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (4): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
    (5): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (6): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(
          original_name=Downsample
          (avg_pool): RecursiveScriptModule(original_name=AvgPool2d)
        )
        (x_upd): RecursiveScriptModule(
          original_name=Downsample
          (avg_pool): RecursiveScriptModule(original_name=AvgPool2d)
        )
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (7): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
      (1): RecursiveScriptModule(
        original_name=AttentionBlock
        (norm): RecursiveScriptModule(original_name=GroupNorm)
        (qkv): RecursiveScriptModule(original_name=Conv1d)
        (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
        (proj_out): RecursiveScriptModule(original_name=Conv1d)
      )
    )
    (8): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
      (1): RecursiveScriptModule(
        original_name=AttentionBlock
        (norm): RecursiveScriptModule(original_name=GroupNorm)
        (qkv): RecursiveScriptModule(original_name=Conv1d)
        (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
        (proj_out): RecursiveScriptModule(original_name=Conv1d)
      )
    )
  )
  (middle_block): RecursiveScriptModule(
    original_name=CondSequential
    (0): RecursiveScriptModule(
      original_name=CondResBlock
      (in_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=GroupNorm)
        (1): RecursiveScriptModule(original_name=SiLU)
        (2): RecursiveScriptModule(original_name=Conv2d)
      )
      (h_upd): RecursiveScriptModule(original_name=Identity)
      (x_upd): RecursiveScriptModule(original_name=Identity)
      (t_emb_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=SiLU)
        (1): RecursiveScriptModule(original_name=Linear)
      )
      (audio_emb_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=SiLU)
        (1): RecursiveScriptModule(original_name=Linear)
      )
      (out_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=GroupNorm)
        (1): RecursiveScriptModule(original_name=SiLU)
        (2): RecursiveScriptModule(original_name=Dropout)
        (3): RecursiveScriptModule(original_name=Conv2d)
      )
      (skip_connection): RecursiveScriptModule(original_name=Identity)
    )
    (1): RecursiveScriptModule(
      original_name=AttentionBlock
      (norm): RecursiveScriptModule(original_name=GroupNorm)
      (qkv): RecursiveScriptModule(original_name=Conv1d)
      (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
      (proj_out): RecursiveScriptModule(original_name=Conv1d)
    )
    (2): RecursiveScriptModule(
      original_name=CondResBlock
      (in_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=GroupNorm)
        (1): RecursiveScriptModule(original_name=SiLU)
        (2): RecursiveScriptModule(original_name=Conv2d)
      )
      (h_upd): RecursiveScriptModule(original_name=Identity)
      (x_upd): RecursiveScriptModule(original_name=Identity)
      (t_emb_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=SiLU)
        (1): RecursiveScriptModule(original_name=Linear)
      )
      (audio_emb_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=SiLU)
        (1): RecursiveScriptModule(original_name=Linear)
      )
      (out_layers): RecursiveScriptModule(
        original_name=Sequential
        (0): RecursiveScriptModule(original_name=GroupNorm)
        (1): RecursiveScriptModule(original_name=SiLU)
        (2): RecursiveScriptModule(original_name=Dropout)
        (3): RecursiveScriptModule(original_name=Conv2d)
      )
      (skip_connection): RecursiveScriptModule(original_name=Identity)
    )
  )
  (output_blocks): RecursiveScriptModule(
    original_name=ModuleList
    (0): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
      (1): RecursiveScriptModule(
        original_name=AttentionBlock
        (norm): RecursiveScriptModule(original_name=GroupNorm)
        (qkv): RecursiveScriptModule(original_name=Conv1d)
        (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
        (proj_out): RecursiveScriptModule(original_name=Conv1d)
      )
    )
    (1): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
      (1): RecursiveScriptModule(
        original_name=AttentionBlock
        (norm): RecursiveScriptModule(original_name=GroupNorm)
        (qkv): RecursiveScriptModule(original_name=Conv1d)
        (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
        (proj_out): RecursiveScriptModule(original_name=Conv1d)
      )
    )
    (2): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
      (1): RecursiveScriptModule(
        original_name=AttentionBlock
        (norm): RecursiveScriptModule(original_name=GroupNorm)
        (qkv): RecursiveScriptModule(original_name=Conv1d)
        (attention): RecursiveScriptModule(original_name=QKVAttentionLegacy)
        (proj_out): RecursiveScriptModule(original_name=Conv1d)
      )
      (2): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Upsample)
        (x_upd): RecursiveScriptModule(original_name=Upsample)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (3): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
    (4): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
    (5): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
      (1): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Upsample)
        (x_upd): RecursiveScriptModule(original_name=Upsample)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Identity)
      )
    )
    (6): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
    (7): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
    (8): RecursiveScriptModule(
      original_name=CondSequential
      (0): RecursiveScriptModule(
        original_name=CondResBlock
        (in_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
        )
        (h_upd): RecursiveScriptModule(original_name=Identity)
        (x_upd): RecursiveScriptModule(original_name=Identity)
        (t_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (audio_emb_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=SiLU)
          (1): RecursiveScriptModule(original_name=Linear)
        )
        (out_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=GroupNorm)
          (1): RecursiveScriptModule(original_name=SiLU)
          (2): RecursiveScriptModule(original_name=Dropout)
          (3): RecursiveScriptModule(original_name=Conv2d)
        )
        (skip_connection): RecursiveScriptModule(original_name=Conv2d)
      )
    )
  )
  (out): RecursiveScriptModule(
    original_name=Sequential
    (0): RecursiveScriptModule(original_name=GroupNorm)
    (1): RecursiveScriptModule(original_name=SiLU)
    (2): RecursiveScriptModule(original_name=Conv2d)
  )
)
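
The dump above is long, so a quick tally of the `original_name` entries can make it easier to compare against another architecture. The following is only a minimal sketch that works on the printed text itself (no checkpoint or PyTorch needed); the helper name `tally_original_names` and the `EXCERPT` constant are made up for illustration, and the excerpt is just the `time_embed` branch copied from the dump.

```python
import re
from collections import Counter

# Matches entries such as "original_name=Conv2d" in a printed ScriptModule.
ORIGINAL_NAME = re.compile(r"original_name=(\w+)")


def tally_original_names(dump_text: str) -> Counter:
    """Count how often each original_name appears in the printed dump."""
    return Counter(ORIGINAL_NAME.findall(dump_text))


# Short excerpt copied from the dump above (the time_embed branch only).
EXCERPT = """\
RecursiveScriptModule(
  original_name=UNet
  (time_embed): RecursiveScriptModule(
    original_name=TimestepEmbedding
    (fc): RecursiveScriptModule(
      original_name=Sequential
      (0): RecursiveScriptModule(original_name=Linear)
      (1): RecursiveScriptModule(original_name=SiLU)
      (2): RecursiveScriptModule(original_name=Linear)
    )
  )
)
"""

counts = tally_original_names(EXCERPT)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Pasting the full dump into `EXCERPT` would give per-type counts (for example, how many `AttentionBlock` and `CondResBlock` modules there are), which is a rough but quick way to check whether a candidate reimplementation has the same block layout.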

UPDATE: There's training code for this at https://github.com/Zejun-Yang/AniPortrait