Open meyerdav opened 2 years ago
Adam Polyak published a paper that touches on this subject, TTS Skins, available at https://arxiv.org/abs/1904.08983, though I believe the model's code was not released alongside it.
On Sun, Apr 24, 2022 at 11:10 PM meyerdav @.***> wrote:
Is it realistic to apply your model for voice conversion? If not, what are your concerns?
Thanks a lot! I will have a look at it! Do you think that treating a speaker's voice as a music domain and directly applying your model could also work?
The paper does exactly that as its baseline system, but shows significantly better performance from an approach that takes into account how voice works.
On Thu, Apr 28, 2022 at 1:59 PM meyerdav @.***> wrote:
Thanks a lot! I will have a look at it! Do you think that treating a speaker's voice as a music domain and directly applying your model could also work?