Singing Voice Synthesis based on Diffusion Model

Goal

In this project, we design a lightweight singing voice synthesis model while maintaining the synthesis quality of the diffusion model.

The dataset used for this project can be found in the following directory:

U-net: Used for processing audio files.
mlp-singer: Deprecated.
Results: Output estimates are stored in /userHome/userhome2/dahyun/voice/Singing_Voice_Synthesis/U-net/outputs/estimates.

The CSD dataset loader mixes background music with nursery-rhyme-style singing voice, and the separate step is then used to isolate the voice from the mixture.
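
To illustrate the kind of mixing such a loader performs, here is a minimal sketch. It assumes mono waveforms held as NumPy arrays; the function name mix_voice_and_accompaniment and the snr_db parameter are hypothetical and are not taken from this repository.

```python
import numpy as np

def mix_voice_and_accompaniment(voice, accompaniment, snr_db=0.0):
    """Hypothetical helper: return (mixture, clean_voice) as a training pair.

    The accompaniment is rescaled so that the voice-to-accompaniment power
    ratio equals snr_db (in decibels) before the two signals are summed.
    """
    # Trim both signals to the shorter length so they can be added.
    n = min(len(voice), len(accompaniment))
    voice = np.asarray(voice[:n], dtype=np.float32)
    accompaniment = np.asarray(accompaniment[:n], dtype=np.float32)

    # Average power of each signal; the epsilon avoids division by zero.
    voice_power = np.mean(voice ** 2) + 1e-8
    acc_power = np.mean(accompaniment ** 2) + 1e-8

    # Gain that brings the accompaniment to the requested SNR.
    gain = np.sqrt(voice_power / (acc_power * 10 ** (snr_db / 10)))

    mixture = voice + gain * accompaniment
    return mixture, voice
```

A separation model would then take the mixture as input and be trained to estimate the clean voice returned alongside it.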

The diffusion model is applied to the CSD dataset for singing voice synthesis.
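
As a rough illustration of the forward (noising) step that diffusion models are built on, here is a minimal sketch. It assumes a standard DDPM-style linear noise schedule and an 80-bin mel-spectrogram as the training target. Neither assumption is confirmed by this repository, and the names linear_beta_schedule and q_sample are hypothetical.

```python
import numpy as np

def linear_beta_schedule(num_steps=100, beta_start=1e-4, beta_end=0.06):
    """Assumed linear noise schedule; the values are illustrative only."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alphas_cumprod, rng):
    """Sample x_t from q(x_t | x_0) in closed form.

    x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise
    """
    noise = rng.standard_normal(x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, noise

betas = linear_beta_schedule()
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta)

rng = np.random.default_rng(0)
mel = rng.standard_normal((80, 128))  # placeholder for a real mel-spectrogram
x_t, noise = q_sample(mel, t=50, alphas_cumprod=alphas_cumprod, rng=rng)
```

During training, a denoising network would be asked to predict noise from x_t and t. Using fewer diffusion steps is one common way to make such a model lighter, although the approach used in this project may differ.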

For detailed implementation and results, refer to the notebook:

Inference notebook: /userHome/userhome2/dahyun/voice/Singing_Voice_Synthesis/U-net/test_and_separate.ipynb