About timbre transfer - Githubissues

ConsistencyVC / ConsistencyVC-voive-conversion

Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

MIT License

134 stars 22 forks

Closed skysbird closed 1 year ago

skysbird commented 1 year ago

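When converting across both language and gender, the transferred timbre still seems to carry a foreign accent. For example, with input: a male voice speaking Chinese and ref: a female voice speaking English, the output female voice speaks Chinese, but with an English accent.

Would it be possible to use whisper-large-v2 as the Whisper model? As far as I can tell it cannot be dropped in directly, because the model dimensions differ: medium.pt is 1024 and large-v2.pt is 1280.

Does the author have a good way to solve this?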

ConsistencyVC commented 1 year ago

Changing ssl_dim in the config file from 1024 to 1280 should be enough.
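A minimal sketch of that config change, assuming the config is a JSON file in which ssl_dim sits somewhere in a nested structure. The fragment below, including the "model" and "hidden_channels" keys, is hypothetical and only illustrates the idea. The helper name set_ssl_dim is also made up for this example.

```python
import json


def set_ssl_dim(config, new_dim):
    """Set every "ssl_dim" key in a nested config; return how many were changed."""
    changed = 0
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "ssl_dim":
                config[key] = new_dim
                changed += 1
            else:
                changed += set_ssl_dim(value, new_dim)
    elif isinstance(config, list):
        for item in config:
            changed += set_ssl_dim(item, new_dim)
    return changed


# Hypothetical config fragment (assumption): the real file has more keys
# and may be laid out differently.
example = json.loads('{"model": {"ssl_dim": 1024, "hidden_channels": 192}}')

# 1280 is the feature width reported in this thread for large-v2.pt.
n_changed = set_ssl_dim(example, 1280)
print(json.dumps(example, indent=2))
```

For a real config, the same helper could be applied to the result of json.load and the dictionary written back with json.dump.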

skysbird commented 1 year ago

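"Changing ssl_dim in the config file from 1024 to 1280 should be enough."

But I would like to use your pretrained model. Does that mean retraining is the only option?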

ConsistencyVC commented 1 year ago

Yes.
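One likely reason retraining is unavoidable: the first layer that consumes the Whisper features presumably has an input width equal to ssl_dim, so a checkpoint trained with 1024 no longer matches a model built with 1280. The sketch below compares parameter shapes using plain tuples. The layer names and the hidden size of 192 are hypothetical, not read from the actual checkpoint.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Hypothetical parameter names and sizes (assumption). A 1-D convolution
# weight has shape (out_channels, in_channels, kernel_size).
pretrained = {
    "enc_p.pre.weight": (192, 1024, 1),  # trained on 1024-dim features
    "enc_p.pre.bias": (192,),
}
rebuilt = {
    "enc_p.pre.weight": (192, 1280, 1),  # model rebuilt with ssl_dim = 1280
    "enc_p.pre.bias": (192,),
}

mismatches = find_shape_mismatches(pretrained, rebuilt)
print(mismatches)
```

Even if the mismatched layer were re-initialized and the rest of the weights loaded, the remaining layers were trained on features from a different Whisper model, so some retraining would still be expected.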