Speaker-Embeddings
An implementation of the GE2E loss from "Generalized End-to-End Loss for Speaker Verification", used to train an encoder that produces speaker embeddings.
This is only a small project for practicing paper reproduction. If you want a repository that actually reproduces the paper, please refer to Resemblyzer or the encoder module of Voice Cloning.
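To make the GE2E loss concrete, here is a minimal pure-Python sketch of the softmax variant's forward computation. It is not this repository's code: the function names (`ge2e_softmax_loss`, `cosine`, `mean`) are hypothetical, and the default scale `w=10.0` and bias `b=-5.0` are assumed from the initial values the paper reports. In a real model they are learned, and the whole computation would be done on tensors with gradients.

```python
import math


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """Sketch of the GE2E softmax loss.

    embeddings[j][i] is the embedding of utterance i from speaker j.
    Every speaker needs at least two utterances, because an utterance is
    excluded from its own speaker's centroid.
    """
    centroids = [mean(utts) for utts in embeddings]
    total = 0.0
    for j, utts in enumerate(embeddings):
        for i, e in enumerate(utts):
            sims = []
            for k, centroid in enumerate(centroids):
                if k == j:
                    # Leave the utterance itself out of its own centroid.
                    centroid = mean(utts[:i] + utts[i + 1:])
                sims.append(w * cosine(e, centroid) + b)
            # Numerically stable log-sum-exp over all speakers.
            top = max(sims)
            log_sum_exp = top + math.log(sum(math.exp(s - top) for s in sims))
            # Pull towards the true speaker, push away from the others.
            total += log_sum_exp - sims[j]
    return total


# Toy batch: 2 speakers x 2 utterances, 2-dimensional embeddings.
separated = [[[1.0, 0.0], [1.0, 0.1]], [[0.0, 1.0], [0.1, 1.0]]]
mixed = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]]]
print(ge2e_softmax_loss(separated) < ge2e_softmax_loss(mixed))
```

The loss is small when each utterance sits close to its own speaker's centroid and far from the others, which is why the well-separated toy batch scores lower than the mixed one.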
Posts on Reproducing ML Papers
How the author of Resemblyzer implements GE2E loss
MultiReader technique
- The authors introduced the MultiReader technique to combine different data sources. This makes it possible to train with multiple keywords (text-dependent, TD-SV) and multiple languages (text-independent and text-dependent, TI-SV and TD-SV), and helps address the problem of limited training data.
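As a rough illustration of the idea, MultiReader can be read as optimizing a weighted sum of per-source losses, with one batch drawn from each data source at every step. The sketch below assumes that reading. `multireader_loss` and its parameters are hypothetical names, and `loss_fn` stands in for whatever per-batch loss is used (for example GE2E).

```python
from typing import Callable, Sequence, TypeVar

Batch = TypeVar("Batch")


def multireader_loss(
    batches: Sequence[Batch],
    weights: Sequence[float],
    loss_fn: Callable[[Batch], float],
) -> float:
    """Weighted sum of per-source losses: sum_k alpha_k * loss(batch_k).

    batches[k] is one batch drawn from data source k and weights[k] is the
    importance alpha_k given to that source.
    """
    if len(batches) != len(weights):
        raise ValueError("need exactly one weight per data source")
    return sum(
        (alpha * loss_fn(batch) for alpha, batch in zip(weights, batches)),
        0.0,
    )


def toy_loss(batch):
    """Stand-in loss: the mean of the numbers in the batch."""
    return sum(batch) / len(batch)


# Large source weighted 1.0, small source weighted 0.5.
print(multireader_loss([[1.0, 3.0], [10.0]], [1.0, 0.5], toy_loss))
```

A smaller weight on a small data source keeps it from dominating the update while still letting it act as a regularizer for the larger one.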
Dataset
- VCTK is a large multi-speaker dataset, sufficient for training
- Mozilla Common Voice is a smaller, crowdsourced multi-speaker dataset (can be sufficient for prototyping)
- VIVOS is a good multi-speaker Vietnamese speech dataset