Please check whether this paper is about 'Voice Conversion' or not.
article info.
title: StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models
summary: One-shot voice conversion (VC) aims to convert speech from any source speaker
to an arbitrary target speaker with only a few seconds of reference speech from
the target speaker. This relies heavily on disentangling the speaker's identity
and speech content, a task that still remains challenging. Here, we propose a
novel approach to learning disentangled speech representation by transfer
learning from style-based text-to-speech (TTS) models. With cycle consistent
and adversarial training, the style-based TTS models can perform
transcription-guided one-shot VC with high fidelity and similarity. By learning
an additional mel-spectrogram encoder through a teacher-student knowledge
transfer and novel data augmentation scheme, our approach results in
disentangled speech representation without needing the input text. The
subjective evaluation shows that our approach can significantly outperform the
previous state-of-the-art one-shot voice conversion models in both naturalness
and similarity.
id: http://arxiv.org/abs/2212.14227v1
judge
Write [vclab::confirmed] or [vclab::excluded] in comment.
Thank you very much for your contribution!
Your judgement is reflected in arXivSearches.json and will be used for VCLab's activities.
Thank you so much.
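For reference, a minimal sketch of how such a judgement might end up in arXivSearches.json. This is only an illustration: the actual schema of arXivSearches.json and the tooling VCLab uses are not described in this request, so the field names, the list layout, and the append-style update below are all assumptions.

    import json
    from pathlib import Path

    # Hypothetical record for this judgement; the real arXivSearches.json schema
    # is not given in the request, so these field names are assumptions.
    judgement = {
        "id": "http://arxiv.org/abs/2212.14227v1",
        "title": "StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models",
        "judge": "[vclab::confirmed]",  # or "[vclab::excluded]" if the paper is not about voice conversion
    }

    path = Path("arXivSearches.json")
    # Assume the file holds a JSON list of such records; start a new list if it is missing.
    records = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    records.append(judgement)
    path.write_text(json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8")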