2021-11-06: I have just updated the code structure to make it easier to understand. It may contain bugs now; I will run some test training later.
2021-11-01: I will update the code and make it easier to use later.
VoiceFixer is a framework for general speech restoration. We aim at the restoration of severely degraded speech and historical speech.
# Download dataset and prepare running environment
git clone https://github.com/haoheliu/voicefixer_main.git
cd voicefixer_main
source init.sh
Here we take VF_UNet (VoiceFixer with a UNet as the analysis module) as an example.
Training
# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to personalize your training
You can check the logs directory for checkpoints, logs, and validation results.
Evaluation
The evaluation script runs automatically and generates a .csv file of results on all test sets.
For example, to evaluate on all test sets (the default):
python3 eval_gsr_voicefixer.py \
--config <path-to-the-config-file> \
--ckpt <path-to-the-checkpoint>
For example, to evaluate only on the GSR test set:
python3 eval_gsr_voicefixer.py \
--config <path-to-the-config-file> \
--ckpt <path-to-the-checkpoint> \
--testset general_speech_restoration \
--description general_speech_restoration_eval
There are generally seven test sets you can pass to --testset:
To evaluate on a small portion of the data, e.g. 10 utterances, pass the number to the --limit_numbers argument.
python3 eval_gsr_voicefixer.py \
--config <path-to-the-config-file> \
--ckpt <path-to-the-checkpoint> \
--limit_numbers 10
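When several evaluations are run in a row, it can help to assemble the command line programmatically. The following is a minimal sketch and not part of the repository: the helper name `build_eval_command` is hypothetical, the file names in the usage line are placeholders, and only the flags shown above (`--config`, `--ckpt`, `--testset`, `--description`, `--limit_numbers`) are used.

```python
import shlex


def build_eval_command(config, ckpt, testset=None, description=None,
                       limit_numbers=None, script="eval_gsr_voicefixer.py"):
    """Return the argument list for one evaluation run.

    Hypothetical helper: it only assembles the flags documented above and
    does not check that the files or the test set name exist.
    """
    cmd = ["python3", script, "--config", config, "--ckpt", ckpt]
    if testset is not None:
        cmd += ["--testset", testset]
    if description is not None:
        cmd += ["--description", description]
    if limit_numbers is not None:
        cmd += ["--limit_numbers", str(limit_numbers)]
    return cmd


# Placeholder paths; replace them with your own config and checkpoint.
example = build_eval_command("config.json", "model.ckpt", limit_numbers=10)
print(shlex.join(example))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues with paths that contain spaces.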
Evaluation results will be presented in the exp_results folder.
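The layout inside exp_results is not described here, so the sketch below simply searches the folder recursively for .csv files and reports how many rows each one holds. It assumes the files are ordinary CSV with a header row; the function names `find_result_files` and `read_rows` are hypothetical.

```python
import csv
from pathlib import Path


def find_result_files(root="exp_results"):
    """Return every .csv file under root, newest first (empty if root is missing)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.csv"),
                  key=lambda p: p.stat().st_mtime, reverse=True)


def read_rows(path):
    """Read one result file into a list of dicts keyed by the header row."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Assumption: run from the repository root, where exp_results is created.
for path in find_result_files():
    print(f"{path}: {len(read_rows(path))} rows")
```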
Training
# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json
You can check the logs directory for checkpoints, logs, and validation results.
Evaluation (similar to voicefixer evaluation)
python3 eval_ssr_unet.py \
--config <path-to-the-config-file> \
--ckpt <path-to-the-checkpoint> \
--limit_numbers <int-test-only-on-a-few-utterance> \
--testset <the-testset-you-want-to-use> \
--description <describe-this-test>
Training
Denoising
# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json
Dereverberation
# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json
Super Resolution
# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json
Declipping
# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json
You can check the logs directory for checkpoints, logs, and validation results.
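The four single-task training commands above differ only in the configuration file, so they can be queued from one script. This is a minimal sketch under the assumption that it is run from the repository root; the names `SSR_TASKS`, `ssr_train_commands`, and `run_all` are hypothetical, and the default is a dry run that only prints the commands.

```python
import subprocess

# Task suffixes taken from the configuration file names listed above.
SSR_TASKS = ["denoising", "dereverberation", "super_resolution", "declipping"]


def ssr_train_commands(tasks=SSR_TASKS):
    """Build one training command per single-task restoration config."""
    return [
        ["python3", "train_ssr_unet.py", "-c",
         f"config/vctk_base_ssr_unet_{task}.json"]
        for task in tasks
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it sequentially only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the queue as soon as one training run fails.
            subprocess.run(cmd, check=True)


run_all(ssr_train_commands())
```

Calling `run_all(ssr_train_commands(), dry_run=False)` would start the four trainings one after another.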
python3 eval_ssr_unet.py \
--config <path-to-the-config-file> \
--ckpt <path-to-the-checkpoint> \
--limit_numbers <int-test-only-on-a-few-utterance> \
--testset <the-testset-you-want-to-use> \
--description <describe-this-test>
@misc{liu2021voicefixer,
title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},
author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},
year={2021},
eprint={2109.13731},
archivePrefix={arXiv},
primaryClass={cs.SD}
}