auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License

Will it also work on unseen data? Will it be able to convert the voice of an unseen speaker whose content differs from the training data, and will we still obtain the disentanglement? #27

Open pycodebook opened 3 years ago

OSSome01 commented 3 years ago

I have the same doubt; can someone please clarify? Thanks in advance.

zhouyong64 commented 2 years ago

I have the same question.

auspicious3000 commented 2 years ago

You can make it generalize to unseen speakers by training it the same way as AutoVC.

skol101 commented 2 years ago

@auspicious3000 Could you explain what you mean by "training it the same way as AutoVC"?

Should I repeat all the steps from https://github.com/auspicious3000/autovc#2train-model?

Or should I change make_metadata.py in SpeechSplit to embed speaker encodings, but train using the model from SpeechSplit?

auspicious3000 commented 2 years ago

@skol101 It means training with generalized speaker embeddings instead of one-hot embeddings.
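
For anyone following along, here is a minimal sketch of what "generalized speaker embeddings" looks like in practice, modeled on AutoVC's make_metadata.py: a pretrained GE2E/d-vector speaker encoder is averaged over a few utterance crops and stored in the training metadata in place of the one-hot speaker id. The checkpoint name `3000000-BL.ckpt`, the `./spmel` directory layout, and the exact metadata entry layout are carried over from AutoVC as assumptions; adjust them to whatever SpeechSplit's data loader expects.

```python
import os
import pickle
from collections import OrderedDict

import numpy as np
import torch

from model_bl import D_VECTOR  # the GE2E speaker encoder shipped with AutoVC

# Load the pretrained speaker encoder (checkpoint name as distributed with AutoVC).
C = D_VECTOR(dim_input=80, dim_cell=768, dim_emb=256).eval().cuda()
ckpt = torch.load('3000000-BL.ckpt')
C.load_state_dict(OrderedDict((k[7:], v) for k, v in ckpt['model_b'].items()))  # strip 'module.'

num_uttrs, len_crop = 10, 128
root = './spmel'  # per-speaker folders of mel-spectrogram .npy files

speakers = []
for spkr in sorted(os.listdir(root)):
    spkr_dir = os.path.join(root, spkr)
    if not os.path.isdir(spkr_dir):
        continue
    files = sorted(os.listdir(spkr_dir))[:num_uttrs]
    embs = []
    for f in files:
        mel = np.load(os.path.join(spkr_dir, f))
        # Random crop; clips shorter than len_crop are used as-is for brevity.
        left = np.random.randint(0, max(1, mel.shape[0] - len_crop))
        crop = torch.from_numpy(mel[np.newaxis, left:left + len_crop, :]).cuda()
        with torch.no_grad():
            embs.append(C(crop).squeeze(0).cpu().numpy())
    # The averaged d-vector takes the place of the one-hot speaker id
    # that SpeechSplit's own make_metadata.py would store here.
    speakers.append([spkr, np.mean(embs, axis=0)] + files)

with open(os.path.join(root, 'train.pkl'), 'wb') as handle:
    pickle.dump(speakers, handle)
```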

skol101 commented 2 years ago

I did that: I used make_metadata.py from AutoVC. I have also removed the validation part from solver.py in this repo, because there is nothing to validate against (just as in solver_encoder.py in AutoVC), and started training.

Am I doing this correctly? Your help is much appreciated.
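
One detail worth flagging for anyone reproducing this setup (not discussed in the thread): SpeechSplit's hparams.py sizes the speaker-embedding input via `dim_spk_emb`, which the released code sets to 82 for the one-hot VCTK ids. If the metadata instead stores 256-dim GE2E d-vectors, that hyperparameter presumably has to follow; a minimal sketch of the change, under that assumption:

```python
# hparams.py (SpeechSplit) -- sketch only.  The released config assumes 82-way
# one-hot speaker ids; with d-vectors from AutoVC's GE2E encoder the speaker
# embedding is 256-dimensional, so the same hyperparameter must match.
dim_spk_emb = 256   # released value: 82 (one-hot)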

auspicious3000 commented 2 years ago

Sounds correct, but you don't need to remove the validation part.

skol101 commented 2 years ago

I had to remove the validation part because I couldn't figure out yet how to create my validation .pkl file based on VCTK plus my custom voices.
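
For anyone stuck at the same step, a rough sketch of building such a validation pickle. It assumes the layout of the repo's assets/demo.pkl that solver.py reads during validation, i.e. one entry per speaker of the form [name, speaker_embedding, (mel, f0, length, uid)], with mel/F0 .npy files produced by make_spect_f0.py in ./spmel and ./raptf0; the `speaker_embeddings.pkl` file and the 192-frame length cap are assumptions of this sketch, so adapt both to what your copy of solver.py actually expects.

```python
import os
import pickle

import numpy as np

spmel_dir, f0_dir = './spmel', './raptf0'   # outputs of make_spect_f0.py (assumed)
max_len = 192  # solver.py pads validation utterances to a fixed length (assumption)

# Hypothetical file: a dict {speaker: 256-dim d-vector}, e.g. reusing the
# embeddings computed with the GE2E encoder in the earlier sketch.
with open('speaker_embeddings.pkl', 'rb') as handle:
    spk_emb = pickle.load(handle)

validation = []
for spkr in sorted(os.listdir(spmel_dir)):
    spkr_dir = os.path.join(spmel_dir, spkr)
    if not os.path.isdir(spkr_dir):
        continue
    entry = [spkr, spk_emb[spkr]]
    for fname in sorted(os.listdir(spkr_dir)):
        mel = np.load(os.path.join(spkr_dir, fname))
        f0 = np.load(os.path.join(f0_dir, spkr, fname))
        if mel.shape[0] > max_len:
            continue  # keep only utterances short enough for the fixed pad length
        # one held-out utterance per speaker: (mel, f0, length, uid)
        entry.append((mel, f0, mel.shape[0], os.path.splitext(fname)[0]))
        break
    validation.append(entry)

with open('assets/demo.pkl', 'wb') as handle:
    pickle.dump(validation, handle)
```

This mixes VCTK and custom voices naturally, since it simply walks whatever speaker folders exist under ./spmel.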