huawei-noah / Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
565 stars 119 forks source link
speech-processing speech-recognition speech-synthesis

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Grad-TTS

Official implementation of the Grad-TTS model based on Diffusion Probabilistic Modelling. For all details check out our paper accepted to ICML 2021 via this link.

Authors: Vadim Popov*, Ivan Vovk*, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov.

*Equal contribution.

SPIRAL

Official implementation of SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training. For all details check out our paper accepted to ICLR 2022 via this link.

Authors: Wenyong Huang, Zhenhe Zhang, Yu Ting Yeung, Xin Jiang, Qun Liu.

DiffVC

Official implementation of the paper "Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme" (ICLR 2022, Oral). Link.

Authors: Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov, Jiansheng Wei.