Closed: MingjieChen closed this issue 2 years ago
@MingjieChen I think it serves as data augmentation. See https://zhuanlan.zhihu.com/p/378269256 (Balanced Consistency Regularization, bCR).
It only matters for datasets with poor recording quality. For example, you don't need it for the JVS dataset (Japanese) because it was recorded in a noise-free studio, whereas VCTK was recorded in the speaker's home. The regularization makes sure the model does not overfit to the background noise.
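For anyone reading later, here is a rough sketch of what a balanced consistency regularization (bCR) term looks like in general: the discriminator is penalized when its output changes between a sample and an augmented copy of it, for both real and generated samples. This is not the repository's code. The names `bcr_loss`, `toy_discriminator`, and `add_noise`, the toy feature lists, and the weights `lambda_real` and `lambda_fake` are all made up for illustration, and plain Python lists stand in for audio features.

```python
import random


def bcr_loss(discriminator, augment, real_batch, fake_batch,
             lambda_real=1.0, lambda_fake=1.0):
    """Illustrative bCR term: squared gap between D(x) and D(augment(x)),
    averaged per batch, for real and generated samples.
    The weights are hypothetical placeholders, not tuned values."""
    def mean_sq_gap(batch):
        gaps = [(discriminator(x) - discriminator(augment(x))) ** 2
                for x in batch]
        return sum(gaps) / len(gaps)

    return (lambda_real * mean_sq_gap(real_batch)
            + lambda_fake * mean_sq_gap(fake_batch))


def toy_discriminator(x):
    # Hypothetical stand-in for a discriminator: mean of the features.
    return sum(x) / len(x)


_rng = random.Random(0)


def add_noise(x, scale=0.1):
    # Hypothetical augmentation: small Gaussian noise on every feature,
    # loosely mimicking background noise in a recording.
    return [v + _rng.gauss(0.0, scale) for v in x]


real = [[1.0, 2.0, 3.0], [0.0, 4.0, 8.0]]   # toy "real" features
fake = [[2.0, 2.0, 2.0]]                    # toy "generated" features
penalty = bcr_loss(toy_discriminator, add_noise, real, fake)
```

The term is added to the discriminator loss, so a discriminator that reacts to the added noise pays a penalty, which matches the intuition above about not latching onto background noise.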
Hello,
I found that the consistency regularization is quite important for the training process, but you did not mention it in the paper. Could you please explain why it is used?
Thanks