SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
https://arxiv.org/abs/2410.06885
MIT License
7.36k stars, 886 forks

Size mismatch for ema_model.transformer.text_embed.text_embed.weight #449

Closed. cod3r0k closed this issue 1 week ago

cod3r0k commented 1 week ago

Checks

Environment Details

PyTorch

Steps to Reproduce

  1. Prepare a custom dataset (CSV).
  2. Run python train.py. It fails with: size mismatch for ema_model.transformer.text_embed.text_embed.weight: copying a param with shape torch.Size([2546, 512]) from checkpoint, the shape in current model is torch.Size([94, 512]).

✔️ Expected Behavior

Training should start.

❌ Actual Behavior

Training does not start. Why is there a size mismatch? The full error is: size mismatch for ema_model.transformer.text_embed.text_embed.weight: copying a param with shape torch.Size([2546, 512]) from checkpoint, the shape in current model is torch.Size([94, 512]).
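
To narrow down an error like this, it can help to list every parameter whose shape differs between the checkpoint and the freshly built model. The following is a minimal sketch, not code from the F5-TTS repository: `find_shape_mismatches` is a hypothetical helper, and the shape dictionaries are filled in by hand from the error message above. With PyTorch, such dictionaries could be built from a state dict as `{k: tuple(v.shape) for k, v in state_dict.items()}`. The first dimension of an embedding weight is normally the number of embedding rows (vocabulary size), so 2546 versus 94 suggests the checkpoint and the current model were built with different vocabularies; this is an assumption, since the issue does not confirm the cause.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for parameters present
    in both mappings whose shapes differ. Hypothetical helper for illustration."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        # Parameters missing from the model are skipped; only shape conflicts are reported.
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes copied by hand from the error message in this issue.
ckpt_shapes = {"ema_model.transformer.text_embed.text_embed.weight": (2546, 512)}
model_shapes = {"ema_model.transformer.text_embed.text_embed.weight": (94, 512)}

for name, ckpt_shape, model_shape in find_shape_mismatches(ckpt_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If the text embedding is the only entry reported, comparing the vocabulary used for the custom dataset with the one the checkpoint was trained on is a reasonable next check.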