Please check whether this paper is about 'Voice Conversion' or not.
article info.
title: Voice Conversion-based Privacy through Adversarial Information Hiding
summary: Privacy-preserving voice conversion aims to remove only the attributes of
speech audio that convey identity information, keeping other speech
characteristics intact. This paper presents a mechanism for privacy-preserving
voice conversion that allows controlling the leakage of identity-bearing
information using adversarial information hiding. This enables a deliberate
trade-off between maintaining source-speech characteristics and modification of
speaker identity. As such, the approach improves on voice-conversion techniques
like CycleGAN and StarGAN, which were not designed for privacy, meaning that
converted speech may leak personal information in unpredictable ways. Our
approach is also more flexible than ASR-TTS voice conversion pipelines, which
by design discard all prosodic information linked to textual content.
Evaluations show that the proposed system successfully modifies perceived
speaker identity whilst well maintaining source lexical content.
id: http://arxiv.org/abs/2409.14919v1
judge
Write [vclab::confirmed] or [vclab::excluded] in a comment.