lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
MIT License

Default assignment of "cond". #51

Open WECarol opened 2 months ago

WECarol commented 2 months ago

Thank you for all your efforts! I have a question about the code :->. I wonder why "cond" defaults to "target" in the forward() function of the VoiceBox class: cond = default(cond, target). It seems that x, cond, and cond tokens in this implementation correspond to w, x_ctx, and z in the Voicebox paper, respectively. But shouldn't x_ctx be masked from x1 rather than from target = x1 - (1 - σ)x0? Some insights would be greatly appreciated.
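To make the question concrete, here is a minimal sketch of the two options for building the context. It uses plain Python lists instead of tensors so it has no dependencies. The helper names (`default`, `flow_matching_terms`, `mask_context`), the toy values, and the masking step are my own assumptions for illustration and are not copied from the repository.

```python
def default(val, d):
    # Hypothetical stand-in for the fallback helper: use d when val is None.
    return val if val is not None else d

def flow_matching_terms(x1, x0, t, sigma=1e-5):
    # w      = (1 - (1 - sigma) * t) * x0 + t * x1   (noisy sample at time t)
    # target = x1 - (1 - sigma) * x0                 (vector field to regress)
    w = [(1 - (1 - sigma) * t) * a0 + t * a1 for a0, a1 in zip(x0, x1)]
    target = [a1 - (1 - sigma) * a0 for a0, a1 in zip(x0, x1)]
    return w, target

def mask_context(x, mask):
    # x_ctx = (1 - m) * x: frames with m == 1 are hidden and must be infilled.
    return [0.0 if m else v for v, m in zip(x, mask)]

x1 = [1.0, 2.0, 3.0, 4.0]      # toy "clean audio" frames
x0 = [0.5, -0.3, 0.1, -0.8]    # toy noise sample (fixed for reproducibility)
mask = [0, 1, 1, 0]            # 1 marks frames to infill

w, target = flow_matching_terms(x1, x0, t=0.5)

# Option A: cond falls back to target, as in the line I am asking about.
cond_from_target = mask_context(default(None, target), mask)

# Option B: cond falls back to x1, which is how I read the paper.
cond_from_x1 = mask_context(default(None, x1), mask)

print(cond_from_target)
print(cond_from_x1)
```

In this sketch the unmasked frames of option A still contain the noise term from x0, while option B keeps the clean x1 values there, which is the discrepancy I am asking about.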