lucidrains / e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
MIT License

sigma #31

Closed skirdey closed 1 week ago

skirdey commented 1 week ago

https://github.com/lucidrains/e2-tts-pytorch/blob/9d5fc1b4fe6e0fecd0e5e43681be0c6d2d1732ec/train_example.py#L32-L39

Should we have a default value of sigma = 0.1 for the training case? It is not explicitly mentioned in the paper, but it seems like a good candidate for adding a bit of variance during training.

JingRH commented 1 week ago

> https://github.com/lucidrains/e2-tts-pytorch/blob/9d5fc1b4fe6e0fecd0e5e43681be0c6d2d1732ec/train_example.py#L32-L39
>
> Should we have default value of sigma = 0.1 for the training case? It is not explicitly mentioned in the paper, but it seems like a good candidate to add a bit of variance during training.

In many flow matching networks, sigma is actually set to zero.
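To make the effect of sigma concrete, here is a minimal sketch of the conditional flow matching interpolant as it is commonly written (the `cfm_point` helper and the exact form `(1 - (1 - sigma) * t) * x0 + t * x1` are assumptions for illustration, not necessarily the repo's exact code). With sigma = 0 the path is a straight line from noise to data; with sigma = 0.1 a residual noise floor of `sigma * x0` remains even at t = 1:

```python
def cfm_point(x0, x1, t, sigma=0.0):
    """Point on the conditional flow matching path from noise x0 to data x1.

    sigma is the minimum noise scale: at t = 1 the sample is
    x1 + sigma * x0, so sigma = 0 recovers plain linear interpolation.
    """
    return (1.0 - (1.0 - sigma) * t) * x0 + t * x1

# At t = 1 with sigma = 0, the noise x0 is fully removed:
print(cfm_point(x0=1.0, x1=0.0, t=1.0, sigma=0.0))   # 0.0

# With sigma = 0.1, a 0.1-scaled noise component survives at t = 1:
print(cfm_point(x0=1.0, x1=0.0, t=1.0, sigma=0.1))   # 0.1
```

This is why sigma = 0 is a common default: the target velocity field stays a simple difference `x1 - (1 - sigma) * x0`, and a nonzero sigma mainly adds a small variance floor rather than changing the training objective.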

lucidrains commented 1 week ago

@skirdey seeing anything on your end?

skirdey commented 1 week ago

Nothing specific yet, just a thought. I am trying it on a few smaller runs at the moment.

lucidrains commented 1 week ago

@skirdey oh strange, thought you'd hit a positive result by now

lucidrains commented 1 week ago

@skirdey maybe start with the dataset Lucas used, and once you see what he sees, then titrate up to your large multilingual one?