insunhwang89 / StyleVC

MIT License

StyleVC - PyTorch official implementation

This is a PyTorch implementation of StyleVC: Non-Parallel Voice Conversion with Adversarial Style Generalization. Feel free to use and modify the code, and please reference our repo if you do.



Updates



Demo Samples

Audio samples generated by this implementation can be found here.



Quick Start

You can quickly run the model using Google Colab.

Run the 'inference.ipynb' file in Colab here.



Install Dependencies

(Optional) You can create an environment using Anaconda:

conda create -n py37torch17 python=3.7.9

(Optional) Then activate your conda environment and install PyTorch and TensorFlow:

conda activate py37torch17
conda install pytorch=1.7 torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install --upgrade tensorflow-gpu==1.15

You can install the Python dependencies with:

pip install -r requirements.txt
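
After installation, it can help to confirm which versions actually ended up in the environment. The snippet below is a minimal sketch and not part of this repository: the helper name `installed_versions` is hypothetical, and the module list simply mirrors the packages named in the install commands above.

```python
import importlib


def installed_versions(modules):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed in the active environment.
            versions[name] = None
            continue
        # Some modules do not expose __version__; report that explicitly.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Import names assumed from the install commands above.
    report = installed_versions(["torch", "torchvision", "torchaudio", "tensorflow"])
    for name, version in report.items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

If `torch` is reported as installed, `torch.cuda.is_available()` can additionally be used to check that the GPU build is active.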



Train Your Model

Datasets

Preprocessing is currently supported for the VCTK dataset.



Preprocessing

You can refer to the sample file and the file structure below on GitHub. For preprocessing, use the following command:

python prepare_dataset.py --in_dir data/VCTK/original/ --out_dir_name VCTK_16K --dataset VCTK

The file structure after preprocessing is as follows:

├── data
│   ├── VCTK
│   │   ├── original    
│   │   │   ├── wav48
│   │   │   │   ├── wavs
│   │   │   ├── metadata.csv
│   │   ├── VCTK_16K
│   │   │   ├── train
│   │   │   │   ├── p225
│   │   │   │   │   ├── p225_021.npz
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── p225_423.npz
│   │   │   │   ├── ...
│   │   │   │   ├── p376
│   │   │   ├── val
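
To sanity-check the preprocessing output, you can count the `.npz` files per speaker. This is a minimal sketch under the assumption that a split directory contains one sub-directory per speaker (as shown for `train` above); the helper name `count_utterances` is hypothetical, and the contents of the `.npz` files are not inspected because their keys are not documented here.

```python
import sys
from pathlib import Path


def count_utterances(split_dir):
    """Return {speaker_id: number of .npz files} for one split directory."""
    split_dir = Path(split_dir)
    counts = {}
    # Each sub-directory (e.g. p225) is assumed to hold one speaker's utterances.
    for speaker_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[speaker_dir.name] = len(list(speaker_dir.glob("*.npz")))
    return counts


if __name__ == "__main__":
    # Pass the path of the preprocessed train directory as the only argument.
    if len(sys.argv) == 2 and Path(sys.argv[1]).is_dir():
        for speaker, n_files in count_utterances(sys.argv[1]).items():
            print(f"{speaker}: {n_files} files")
```

A speaker with zero files usually points to a wrong `--in_dir` or an interrupted preprocessing run.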



Train

To train, set the hyperparameters in model/hparams.py and run the following command:

python trainer.py --dataset VCTK --dataset_name VCTK_16K --log_dir StyleVC_VCTK_test01
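
The preprocessing and training commands both use the name `VCTK_16K` (`--out_dir_name` and `--dataset_name`, respectively), which suggests the two values need to agree. The sketch below builds both commands from one shared name so they cannot drift apart. It is an illustration only: `build_commands` is a hypothetical helper, and the assumption that the two flags must match is inferred from the commands in this README rather than confirmed by the code.

```python
def build_commands(dataset_name, log_dir,
                   in_dir="data/VCTK/original/", dataset="VCTK"):
    """Return (preprocess_cmd, train_cmd) as argument lists sharing one dataset name."""
    preprocess_cmd = [
        "python", "prepare_dataset.py",
        "--in_dir", in_dir,
        "--out_dir_name", dataset_name,
        "--dataset", dataset,
    ]
    train_cmd = [
        "python", "trainer.py",
        "--dataset", dataset,
        "--dataset_name", dataset_name,  # assumed to match --out_dir_name
        "--log_dir", log_dir,
    ]
    return preprocess_cmd, train_cmd


if __name__ == "__main__":
    # Only prints the commands; pass each list to subprocess.run(cmd, check=True)
    # from the repository root to actually execute them.
    for cmd in build_commands("VCTK_16K", "StyleVC_VCTK_test01"):
        print(" ".join(cmd))
```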



Vocoder

We used a fine-tuned HiFi-GAN vocoder. Download the checkpoint and config file below and save them in 'vocoder/checkpoint'.

Model    Checkpoint file    Config file
VCTK     Download           Download



Inference

python inference.py



Checkpoint

We provide a pretrained checkpoint. Download the checkpoint file below and put it in 'outputs/StyleVC_VCTK'.

Model    Checkpoint file
VCTK     Download
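
Before running inference, it may be worth checking that the downloaded files are where this README says to put them. The following is a minimal sketch, not part of the repository: `missing_checkpoint_dirs` is a hypothetical helper, and it only checks that the two directories named above ('vocoder/checkpoint' and 'outputs/StyleVC_VCTK') exist and contain at least one file. It makes no assumption about the checkpoint file names.

```python
from pathlib import Path

# Directories named in this README for the vocoder and model checkpoints.
REQUIRED_DIRS = ("vocoder/checkpoint", "outputs/StyleVC_VCTK")


def missing_checkpoint_dirs(root=".", required=REQUIRED_DIRS):
    """Return the required directories that are absent or contain no files."""
    root = Path(root)
    missing = []
    for rel in required:
        directory = root / rel
        has_file = directory.is_dir() and any(p.is_file() for p in directory.iterdir())
        if not has_file:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    problems = missing_checkpoint_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Checkpoint directories look populated.")
```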



Citation

Please cite the paper if you find StyleVC useful.

@inproceedings{hwang2022stylevc,
  title={StyleVC: Non-Parallel Voice Conversion with Adversarial Style Generalization},
  author={Hwang, In-Sun and Lee, Sang-Hoon and Lee, Seong-Whan},
  booktitle={2022 26th International Conference on Pattern Recognition (ICPR)},
  pages={23--30},
  year={2022},
  organization={IEEE}
}