yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
MIT License

How to handle inference on data outside the training set? The inference results are not realistic enough #98

Open Nanshanelectrician opened 6 months ago

Nanshanelectrician commented 6 months ago

How should I handle inference on data that was not part of the training set? The inference results are not realistic enough.

yl4579 commented 6 months ago

Can you be more specific? Do you mean unseen speakers? Unseen samples? What kind of input is not in the training data?

Nanshanelectrician commented 6 months ago

I used two of my own audio files, hoping that A would imitate what B says. The output does not sound like A speaking, so it feels unrealistic.