auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License
636 stars 92 forks

How do I synthesize the speech I need? #63

Open sanena opened 2 years ago

sanena commented 2 years ago

Hello! I have only recently started learning voice conversion, so I have many questions. One of them is how to specify the source speech and the target speech so that I can synthesize the speech I need. In short, could you tell me how to restructure demo.pkl? I would be very grateful for an answer!

HJYblur commented 2 years ago

Actually, I believe the structure of demo.pkl is:

-sbmt[0] (the data of speaker 1)
--name of the speaker, e.g. P226
--one-hot vector of the speaker
--four components
---x0 (the mel spectrogram of the wav)
---f0 (represents the pitch)
---length of x0 or f0
---a number whose meaning I don't know

-sbmt[1] (same structure as above)
....

You can print them out to see the details.
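
As a rough way to do that printing, here is a minimal sketch that loads a pickle file and lists its nested structure. The helper names `summarize` and `summarize_pickle` are made up for this example and are not part of SpeechSplit. It assumes the file holds nested lists or tuples, and that anything with `shape` and `dtype` attributes is an array such as a NumPy array. Only unpickle files you trust, since `pickle.load` can execute arbitrary code.

```python
import pickle


def summarize(obj, indent=0):
    """Return a list of text lines describing a nested object."""
    pad = "  " * indent
    # Array-like objects (e.g. NumPy arrays) expose .shape and .dtype;
    # report those instead of dumping every value.
    if hasattr(obj, "shape") and hasattr(obj, "dtype"):
        return [f"{pad}array shape={tuple(obj.shape)} dtype={obj.dtype}"]
    # Lists and tuples: report the length, then describe each item.
    if isinstance(obj, (list, tuple)):
        lines = [f"{pad}{type(obj).__name__} len={len(obj)}"]
        for i, item in enumerate(obj):
            lines.append(f"{pad}  [{i}]")
            lines.extend(summarize(item, indent + 2))
        return lines
    # Anything else (speaker name, length, ...): show type and value.
    return [f"{pad}{type(obj).__name__}: {obj!r}"]


def summarize_pickle(path):
    """Load a pickle file and return its structure as text lines."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return summarize(data)


# Example usage (adjust the path to wherever your demo.pkl lives):
# print("\n".join(summarize_pickle("demo.pkl")))
```

Comparing the printed shapes and lengths against the outline above should show which entry is the speaker name, which is the one-hot vector, and which are x0 and f0.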