flrngel / understanding-ai


Neural Voice Cloning with a Few Samples #8

Open flrngel opened 6 years ago

flrngel commented 6 years ago

https://arxiv.org/abs/1802.06006 Paper from Baidu Research

Abstract

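The paper studies voice cloning from a few audio samples, using two approaches: speaker adaptation and speaker encoding.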

1. Introduction

2. Voice Cloning

image

Paper Notations

2.1. Speaker adaptation

Speaker adaptation function

image
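As a rough illustration of speaker adaptation (fine-tuning on a few cloning samples), here is a toy sketch that keeps the generative model frozen and optimizes only a new speaker's embedding by gradient descent. Everything here is an assumption for illustration: the linear "model" (`A`, `B`), the dimensions, the squared-error loss, and the names `generate` and `adapt_embedding` are hypothetical and are not the paper's architecture or objective.

```python
import numpy as np

rng = np.random.default_rng(0)
TEXT_DIM, EMB_DIM, AUDIO_DIM = 6, 4, 8  # hypothetical toy sizes

# Frozen weights standing in for a trained multi-speaker generative model.
A = rng.normal(size=(AUDIO_DIM, TEXT_DIM)) / np.sqrt(AUDIO_DIM)
B = rng.normal(size=(AUDIO_DIM, EMB_DIM)) / np.sqrt(AUDIO_DIM)


def generate(text_feats, emb):
    """Toy generator: audio features from text features and a speaker embedding."""
    return text_feats @ A.T + emb @ B.T  # shape (n_samples, AUDIO_DIM)


def loss(text_feats, audio, emb):
    """Mean squared reconstruction error over the cloning samples."""
    return float(np.mean(np.sum((generate(text_feats, emb) - audio) ** 2, axis=1)))


def adapt_embedding(text_feats, audio, steps=200, lr=0.05):
    """Embedding-only adaptation: A and B stay fixed, only emb is updated."""
    emb = np.zeros(EMB_DIM)
    for _ in range(steps):
        residual = generate(text_feats, emb) - audio  # (n_samples, AUDIO_DIM)
        grad = 2.0 * residual.mean(axis=0) @ B        # gradient of loss w.r.t. emb
        emb -= lr * grad
    return emb
```

Calling `adapt_embedding(texts, audio)` on a handful of (text, audio) pairs from an unseen speaker returns an embedding that lowers the reconstruction loss relative to the zero initialization. Adapting the whole model instead would also update `A` and `B`.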

2.2. Speaker encoding

Speaker encoding function

image

The paper avoids mode collapse by training the speaker encoder separately from the multi-speaker generative model.

Loss function (L1)

image
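A minimal sketch of an L1 loss between a predicted speaker embedding and a target embedding, to make the note above concrete. The function name `l1_embedding_loss` is hypothetical, and averaging over dimensions (rather than summing) is an assumption; the paper's exact normalization is not recorded in these notes.

```python
import numpy as np


def l1_embedding_loss(predicted, target):
    """Mean absolute difference between two speaker embeddings (assumed reduction)."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(predicted - target)))
```

In this reading, `predicted` would be the speaker encoder's output for a set of cloning samples and `target` the embedding already learned for that speaker by the generative model.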

Architecture

image

2.3. Discriminative models for evaluation

Because human evaluation is expensive, the paper proposes two discriminative models for evaluation:

2.3.1. Speaker Classification

2.3.2. Speaker Verification
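Speaker verification systems are commonly scored with the equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. The sketch below is one simple way to estimate it from similarity scores; the function name `equal_error_rate`, the threshold sweep over observed scores, and the "closest gap" approximation are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate EER. labels: True for same-speaker trials, False for impostors.

    Both classes must be present in labels.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        accept = scores >= threshold
        far = np.mean(accept[~labels])  # impostor trials wrongly accepted
        frr = np.mean(~accept[labels])  # same-speaker trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)
```

A lower EER on cloned audio would suggest the cloned voice is harder to tell apart from the target speaker's real recordings.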

3. Experiments

3.1. Datasets