Open narendranp opened 2 years ago
"upsample_rates": [2,5,4,4],
"upsample_kernel_sizes": [16,15,4,4],
"upsample_initial_channel": 512,
"resblock_kernel_sizes": [3,7,11],
"resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
"resblock_initial_channel": 256,
"segment_size": 5120,
"num_mels": 80,
"num_freq": 512,
"n_fft": 512,
"hop_size": 160,
"win_size": 512,
"sampling_rate": 16000,
I use these parameters for training at 16 kHz.
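A quick way to sanity-check a config like this is to confirm that the upsampling factors multiply out to the hop size, so that each mel frame expands to exactly `hop_size` audio samples. The sketch below is only an illustration. It assumes the generator follows the reference HiFi-GAN layout, where each transposed convolution uses the matching entry of `upsample_rates` as its stride and `(kernel - stride) // 2` as its padding. If your code base differs, adjust the padding rule.

```python
from math import prod

# Values copied from the config posted above.
upsample_rates = [2, 5, 4, 4]
upsample_kernel_sizes = [16, 15, 4, 4]
hop_size = 160
segment_size = 5120

# The generator turns one mel frame into prod(upsample_rates) samples,
# so this product should equal hop_size.
total_upsampling = prod(upsample_rates)


def transposed_conv_out_len(length, kernel, stride, padding):
    """Output length of a 1-D transposed convolution
    (dilation 1, no output padding)."""
    return (length - 1) * stride - 2 * padding + kernel


# Number of mel frames in one training segment.
frames = segment_size // hop_size

length = frames
for kernel, stride in zip(upsample_kernel_sizes, upsample_rates):
    # Assumption: padding rule of the reference implementation.
    padding = (kernel - stride) // 2
    length = transposed_conv_out_len(length, kernel, stride, padding)

print(total_upsampling, frames, length)  # 160 32 5120
```

Under these assumptions the second layer has kernel size 15 and stride 5, and the 32 frames of one segment come out as exactly 5120 samples.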
Would you mind sharing the trained checkpoint? Thank you in advance!
"upsample_kernel_sizes": [16,15,4,4]? Is 15/2 the stride of the second transposed convolution (De_Conv)?
May I ask why you set n_fft to 512 instead of 4 times hop_size (4*160=640), which is a common default setting for the STFT?
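To make the two options in this question concrete, the sketch below compares the framing that results from n_fft = 512 and n_fft = 640 at a 16 kHz sampling rate with a hop of 160 samples. It assumes the window length equals n_fft, as in the posted config where `win_size` is 512. The helper name `frame_stats` is made up for illustration. The thread does not state why 512 was chosen. One possible factor is that 512 is a power of two, which FFT implementations generally handle most efficiently.

```python
sampling_rate = 16000
hop_size = 160


def frame_stats(n_fft):
    """Framing figures for a given FFT size, assuming win_size == n_fft."""
    return {
        "window_ms": 1000 * n_fft / sampling_rate,  # window duration
        "hop_ms": 1000 * hop_size / sampling_rate,  # frame shift
        "overlap": 1 - hop_size / n_fft,            # overlap between frames
        "freq_bins": n_fft // 2 + 1,                # one-sided spectrum size
        "power_of_two": n_fft & (n_fft - 1) == 0,
    }


for n_fft in (512, 640):
    print(n_fft, frame_stats(n_fft))
```

With 512 the window spans 32 ms with 68.75% overlap and 257 frequency bins. With 640 it spans 40 ms with 75% overlap and 321 bins.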
Hi, does anyone know where a pretrained HiFi-GAN vocoder that works at 16 kHz is available? Or does anyone have a config file (hyperparameter settings for 16 kHz) that gives the best possible quality at 16 kHz? I have not been able to find the right parameter settings for a 16 kHz vocoder. It would be very helpful for me.
Thanks in advance, Narendra