Closed: wizardk closed this issue 1 year ago
Hi @wizardk, thank you for the question. I think you can train the model at a 16 kHz sampling rate by modifying sampling_rate in the JSON config file. For example, in the case of MS-iSTFT-VITS, you need to modify this line.
Thanks for your help. But I think it is necessary to modify fft_sizes, hop_sizes, win_lengths, filter_length, hop_length, and win_length as well as sampling_rate. Is that right?
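One way to keep the analysis frames at the same duration in milliseconds is to scale every sample-count parameter by the ratio of the new to the old sampling rate. The following is only a minimal sketch of that arithmetic: the function name `scale_frame_params` and all baseline values are hypothetical and chosen for illustration, not taken from the repository's config. It is also an assumption that proportional scaling is the right choice here; if the decoder's upsampling factors are tied to hop_length, those would likely need to change too.

```python
def scale_frame_params(config, old_sr, new_sr):
    """Return a copy of config with sample-count parameters scaled by new_sr / old_sr."""
    scaled = dict(config)
    scaled["sampling_rate"] = new_sr
    # Scalar STFT parameters, measured in samples.
    for key in ("filter_length", "hop_length", "win_length"):
        scaled[key] = round(config[key] * new_sr / old_sr)
    # List-valued parameters (e.g. for a multi-resolution STFT loss).
    for key in ("fft_sizes", "hop_sizes", "win_lengths"):
        scaled[key] = [round(v * new_sr / old_sr) for v in config[key]]
    return scaled


# Hypothetical baseline values for illustration only; check the real JSON file.
baseline = {
    "sampling_rate": 24000,
    "filter_length": 1200,
    "hop_length": 300,
    "win_length": 1200,
    "fft_sizes": [600, 1200, 300],
    "hop_sizes": [60, 120, 30],
    "win_lengths": [300, 600, 150],
}

scaled = scale_frame_params(baseline, old_sr=24000, new_sr=16000)
print(scaled)
```

Scaled values that do not come out as convenient integers would need to be rounded by hand to something the model accepts.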
Hi @wizardk, I have the same problem as you. Do you know which parameters need to be adjusted for 16 kHz?
I have the same problem. I am trying 16 kHz. I changed only the sample_rate parameter, but the synthesized speech is bad: it is too slow, as if 24 kHz audio were being played back at 16 kHz. All phoneme durations are strange.
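Speech that sounds slowed down in this way can happen when the training audio was never resampled and still has its original rate. That cause is not confirmed anywhere in this thread, so treat the following as a sanity check only. It is a minimal sketch using Python's standard `wave` module; the helper name `find_mismatched_wavs` is hypothetical, and it only handles uncompressed PCM WAV files.

```python
import wave


def find_mismatched_wavs(paths, expected_sr):
    """Return (path, actual_rate) for each WAV file whose header rate differs from expected_sr."""
    mismatched = []
    for path in paths:
        # The wave module reads the sampling rate from the WAV header.
        with wave.open(path, "rb") as wav:
            actual_sr = wav.getframerate()
        if actual_sr != expected_sr:
            mismatched.append((path, actual_sr))
    return mismatched
```

Calling it with the list of training files and `expected_sr=16000` returns the files that would still need resampling before training with a 16 kHz config.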
Hi @MasayaKawamura, thanks for your work.
I have a question. If I want to use a 16 kHz sampling rate, how should I modify the configuration file? I assume it is not enough to change only sampling_rate in the JSON file.