Request on details of the data augmentation module

hayeong0 / DDDM-VC

Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

https://hayeong0.github.io/DDDM-VC-demo/

Request on details of the data augmentation module #2

Closed by kur114 6 months ago

kur114 commented 6 months ago

Hello, I am in the process of training a model and find myself uncertain about the specifics of the data augmentation module. Here are two questions:

1. Is the module implemented using this code: https://github.com/revsic/torch-nansypp/blob/main/utils/augment?
2. Could you specify which hyperparameters in your configuration are passed to the augmentation module?

Thank you for your reply!

hayeong0 commented 6 months ago

Hello, thank you for showing interest in our work. I have updated the augmentation code that I used for this work, so please refer to it:

https://github.com/hayeong0/DDDM-VC/tree/master/augmentation

We used the hyperparameter ranges proposed by NANSY. Other values gave similar or worse results in our experiments, so we kept the same settings. Specifically:

Formant shifting: U(1, 1.4)
Pitch randomization: U(1, 2)
Random frequency shaping: U(1, 1.5)
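
For readers who want to see how ranges like these might be drawn per utterance, here is a minimal sketch. It is not the code from the augmentation directory linked above. The names `AugmentRatios`, `sample_ratio`, and `sample_augment_ratios` are hypothetical, and the optional reciprocal flip (so a ratio can shift down as well as up) is an assumption, not something confirmed in this thread. Only the three ranges come from the reply.

```python
import random
from dataclasses import dataclass

# Ranges quoted in the reply above. The constant names are hypothetical.
FORMANT_SHIFT_RANGE = (1.0, 1.4)
PITCH_RANDOMIZATION_RANGE = (1.0, 2.0)
FREQUENCY_SHAPING_RANGE = (1.0, 1.5)


@dataclass
class AugmentRatios:
    """One set of sampled augmentation ratios (hypothetical container)."""
    formant_shift: float
    pitch_randomization: float
    frequency_shaping: float


def sample_ratio(rng, low, high, allow_reciprocal=False):
    """Draw a ratio from U(low, high).

    If allow_reciprocal is True, the ratio is inverted with probability 0.5.
    This flip is an assumption made for illustration only.
    """
    ratio = rng.uniform(low, high)
    if allow_reciprocal and rng.random() < 0.5:
        ratio = 1.0 / ratio
    return ratio


def sample_augment_ratios(rng, allow_reciprocal=False):
    """Sample one ratio per augmentation from the ranges listed above."""
    return AugmentRatios(
        formant_shift=sample_ratio(
            rng, *FORMANT_SHIFT_RANGE, allow_reciprocal=allow_reciprocal),
        pitch_randomization=sample_ratio(
            rng, *PITCH_RANDOMIZATION_RANGE, allow_reciprocal=allow_reciprocal),
        frequency_shaping=sample_ratio(
            rng, *FREQUENCY_SHAPING_RANGE, allow_reciprocal=allow_reciprocal),
    )


# Usage: a seeded generator keeps the draws reproducible across runs.
rng = random.Random(0)
print(sample_augment_ratios(rng))
```

The sampled ratios would then be handed to whatever signal-processing routines perform the actual formant, pitch, and frequency-shaping changes. That part is left out here because it depends on the repository's implementation.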

kur114 commented 6 months ago

Thank you very much for your detailed and helpful response. The implementation code has clarified my understanding and will greatly help with my model training.