This repo builds on the DiffWave approach [https://arxiv.org/abs/2009.09761].
English speech: [https://keithito.com/LJ-Speech-Dataset/]
Danish speech: [https://sprogteknologi.dk/dataset/nst-acoustic-database-for-danish-16-khz]
This repo takes advantage of two frameworks: (1) Hydra for config management and (2) PyTorch Lightning for making collaboration and running experiments on different hardware easier.
The particular approach of this repo is heavily inspired by [https://youtu.be/w10WrRA-6uI].
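A minimal sketch of how these two frameworks usually fit together is shown below. The entry point, the configs/ directory, and the config keys (cfg.model, cfg.datamodule, cfg.trainer.max_epochs) are assumptions for illustration, not necessarily how this repo organizes things:

```python
# Hypothetical sketch of a Hydra + PyTorch Lightning entry point; the actual
# script, config layout, and config keys in this repo may differ.
import hydra
import pytorch_lightning as pl
from omegaconf import DictConfig


@hydra.main(config_path="configs", config_name="config", version_base=None)
def train(cfg: DictConfig) -> None:
    # Hydra composes the config from YAML files; Lightning handles the hardware.
    model = hydra.utils.instantiate(cfg.model)            # assumed config key
    datamodule = hydra.utils.instantiate(cfg.datamodule)  # assumed config key
    trainer = pl.Trainer(
        max_epochs=cfg.trainer.max_epochs,
        accelerator="auto",  # CPU, single GPU, or multi-GPU without code changes
        devices="auto",
    )
    trainer.fit(model, datamodule=datamodule)


if __name__ == "__main__":
    train()
```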
On HPC:
It's very important that you load a recent Python version before running the getting-started script. Do this by:
module load python3/3.9.11
Set the following environment variables (required):
export DATA_PATH_PREFIX=  # path to your data
export WANDB_KEY=         # your wandb API key
export PATH_TO_VENV=      # path to your venv
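For illustration, this is roughly how such variables end up being read on the Python side; the snippet below is hypothetical, not the repo's actual code:

```python
# Hypothetical illustration of how the exported variables might be consumed;
# the repo's own scripts may handle this differently.
import os
from pathlib import Path

data_root = Path(os.environ["DATA_PATH_PREFIX"])  # raises KeyError if not exported
wandb_key = os.environ.get("WANDB_KEY")           # None if not exported
venv_path = Path(os.environ["PATH_TO_VENV"])
print(f"data: {data_root}  wandb key set: {wandb_key is not None}  venv: {venv_path}")
```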
Run the bash script from the root folder:
./get_started.sh
This will create a virtual environment named venv in the parent folder. If you want the name or location to be different, that's up to you; just remember that the path to the environment is used in the train_dtu_hpc.sh script, so adjust it accordingly.

You can further install the requirements for developing the package:
pip install -r requirements-dev.txt
This repo has protection on the main branch, so any contribution has to go through a Pull Request. Make sure to run make in the root directory and push the changes before creating a Pull Request. This requires you to have the packages in requirements-dev.txt installed.
To run conditional training and/or evaluation, we first need to create spectrograms:
python3 scripts/preprocess.py
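For orientation, a mel-spectrogram extraction step usually looks roughly like the sketch below; the sample rate, n_fft, hop_length, and n_mels values are assumptions and may not match what scripts/preprocess.py actually uses:

```python
# Hypothetical sketch of mel-spectrogram extraction; parameter values are
# illustrative and may not match scripts/preprocess.py.
import torch
import torchaudio


def wav_to_log_mel(wav_path: str, sample_rate: int = 22050, n_mels: int = 80) -> torch.Tensor:
    waveform, sr = torchaudio.load(wav_path)
    if sr != sample_rate:
        # Resample so all clips share the rate the model is conditioned on.
        waveform = torchaudio.functional.resample(waveform, sr, sample_rate)
    mel = torchaudio.transforms.MelSpectrogram(
        sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=n_mels
    )(waveform)
    # Log-compress, clamping to avoid log(0); log-mels are the usual conditioning input.
    return torch.log(mel.clamp(min=1e-5))
```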