lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
MIT License

How to pick sigma? #46

Open ex3ndr opened 6 months ago

ex3ndr commented 6 months ago

Hey everyone, I am trying to figure out what value of σ (sigma) is meant to be used during training. There is no mention of a specific value in the papers for some reason.

zvorinji commented 6 months ago

The paper does mention 0.00001 as a starting point, but it's unclear whether that's actually what they used. That said, if you're testing whether a different value makes the model converge faster, I'd go up (not down) by one order of magnitude per test, and wouldn't go above 1. So basically: test 0.0001; if that's better than the default, test 0.001, and so on.
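A minimal sketch of that sweep, assuming the 0.00001 default as the lower end and 1 as the cap. The helper name `sigma_sweep` is made up for illustration and is not part of voicebox-pytorch:

```python
def sigma_sweep(start_exp=-5, stop_exp=0):
    """List candidate sigma values, one per order of magnitude.

    start_exp=-5 corresponds to 0.00001 and stop_exp=0 to the cap of 1.
    Both bounds are assumptions taken from the comment above.
    """
    # Parsing "1e-5", "1e-4", ... gives the exact float for each power of ten.
    return [float(f"1e{exp}") for exp in range(start_exp, stop_exp + 1)]


candidates = sigma_sweep()
for sigma in candidates:
    # Train with this sigma and compare convergence against the previous run;
    # stop moving up once the results get worse.
    print(f"sigma = {sigma:g}")
```

Each value would still need its own training run. The list only fixes the order in which to try them.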

ex3ndr commented 6 months ago

Where did you get this number? I don't see it in the paper.

Subuday commented 6 months ago

@ex3ndr The Flow Matching paper recommends setting it sufficiently small. The MatchaTTS authors (it also uses Flow Matching) trained their model with a value of 1e-5.
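To see why a small value makes sense, here is a rough sketch of where sigma enters the optimal-transport path from the Flow Matching paper: x_t = (1 - (1 - σ)·t)·x_0 + t·x_1, with regression target x_1 - (1 - σ)·x_0. It is plain Python for illustration only. The function name `cfm_sample` and its arguments are hypothetical, and this is not necessarily how voicebox-pytorch implements it:

```python
import random


def cfm_sample(x1, sigma_min=1e-5, t=None, rng=random):
    """Draw one training triple for the OT flow matching path.

    x1 is a data vector (list of floats). Returns (x_t, target, x0),
    where x0 is the Gaussian noise sample the path starts from.
    """
    if t is None:
        t = rng.random()  # t ~ U[0, 1)
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]
    # The noise weight shrinks from 1 at t=0 to sigma_min at t=1.
    noise_scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [noise_scale * n + t * d for n, d in zip(x0, x1)]
    # The vector field the network is trained to regress.
    target = [d - (1.0 - sigma_min) * n for n, d in zip(x0, x1)]
    return x_t, target, x0


x1 = [0.5, -1.0, 2.0]
x_end, _, noise = cfm_sample(x1, sigma_min=1e-5, t=1.0)
# At t=1 the sample is x1 plus sigma_min times the noise.
residual = [a - b for a, b in zip(x_end, x1)]
```

At t = 1 the path ends at x_1 plus σ times the noise, so σ is the standard deviation of the noise left around each data point. That is why it is kept small, and why values approaching 1 blur the target distribution.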