ex3ndr / supervoice-voicebox

VoiceBox neural network implementation

Drop the unmasked tokens #10

Open lixuyuan102 opened 6 months ago

lixuyuan102 commented 6 months ago

Nice work! May I ask whether the 0.9 probability of dropping unmasked tokens, to condition on audio only, is important? Could you share the details of the A/B study?

ex3ndr commented 6 months ago

I didn't do any A/B testing; I just wanted to be able to try conditioning without alignments. But since alignments subjectively improve performance, there is no need to do this in practice.
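
For readers following the thread, here is a minimal sketch of what "dropping unmasked tokens" could look like. It is not taken from the repository: the function name `drop_unmasked_tokens`, the `pad_token` value, and the choice to make one drop decision per sample (rather than per token) are all assumptions made for illustration. Only the 0.9 probability comes from the question above.

```python
import random


def drop_unmasked_tokens(tokens, mask, drop_prob=0.9, pad_token=0, rng=None):
    """Hypothetical helper: with probability `drop_prob`, replace the token at
    every unmasked position (mask[i] is False) with `pad_token`, so those
    positions are conditioned on audio only. Masked positions keep their tokens.
    """
    if len(tokens) != len(mask):
        raise ValueError("tokens and mask must have the same length")
    rng = rng or random.Random()
    # Assumption: a single draw decides the whole sample.
    if rng.random() >= drop_prob:
        return list(tokens)
    return [t if m else pad_token for t, m in zip(tokens, mask)]


# Forcing the drop (drop_prob=1.0) clears tokens outside the masked span.
print(drop_unmasked_tokens([5, 6, 7, 8], [False, True, True, False], drop_prob=1.0))
```

With `drop_prob=0.0` the tokens come back unchanged, which corresponds to always conditioning on alignments.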