auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License

Can it transfer RFU between different utterances? #44

Open flagman opened 3 years ago

flagman commented 3 years ago

Hi. Thank you for the fantastic project. Is your model capable of transferring content, rhythm, and pitch between different sentences? I prepared a demo.pkl file in which metadata[0] is the utterance in metadata[0].wav.zip, and metadata[1] was left unchanged.
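
A minimal sketch of that kind of replacement, assuming demo.pkl is a pickled Python list with one entry per utterance. The helper name `replace_entry`, the output file name, and the three-field entry layout (speaker id, speaker embedding, feature tuple) are hypothetical placeholders for illustration, not the repository's confirmed format, and the values are toy data rather than real features.

```python
import os
import pickle
import tempfile


def replace_entry(pkl_path, index, new_entry, out_path):
    """Load a pickled list, replace one entry, and save it to out_path."""
    with open(pkl_path, "rb") as f:
        metadata = list(pickle.load(f))
    metadata[index] = new_entry
    with open(out_path, "wb") as f:
        pickle.dump(metadata, f)
    return metadata


# Toy stand-in for demo.pkl. The entry layout below is an assumption.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "demo.pkl")
dst = os.path.join(workdir, "demo_custom.pkl")
toy = [
    ["speaker_a", [1.0, 0.0], ([[0.1, 0.2]], [0.5], 1, "utt_a")],
    ["speaker_b", [0.0, 1.0], ([[0.3, 0.4]], [0.6], 1, "utt_b")],
]
with open(src, "wb") as f:
    pickle.dump(toy, f)

# Swap in a custom first entry and leave the second entry untouched.
custom = ["speaker_c", [1.0, 0.0], ([[0.7, 0.8]], [0.9], 1, "utt_c")]
result = replace_entry(src, 0, custom, dst)
```

Writing to a separate output file keeps the original demo.pkl intact, so the unmodified demo can still be run for comparison.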

Here is the result: p226_p231_003002_RFU.wav.zip

Did I do something wrong, or is your model not intended for such conversions?

skol101 commented 2 years ago

@flagman how did you manage to prepare your custom demo.pkl?