Open yanglu1994 opened 2 years ago
I'm able to train and run inference at 44100 Hz with this config:
{
    "resblock": "1",
    "num_gpus": 3,
    "batch_size": 8,
    "learning_rate": 0.0002,
    "adam_b1": 0.8,
    "adam_b2": 0.99,
    "lr_decay": 0.9995,
    "seed": 1234,
    "upsample_rates": [8, 8, 2, 2, 2],
    "upsample_kernel_sizes": [16, 16, 4, 4, 4],
    "upsample_initial_channel": 512,
    "resblock_kernel_sizes": [3, 7, 11],
    "resblock_dilation_sizes": [[1, 3, 5], [1, 3, 5], [1, 3, 5]],
    "discriminator_periods": [3, 5, 7, 11, 17, 23, 37],
    "segment_size": 16384,
    "num_mels": 80,
    "num_freq": 1025,
    "n_fft": 2048,
    "hop_size": 512,
    "win_size": 2048,
    "sampling_rate": 44100,
    "fmin": 20,
    "fmax": 11025,
    "fmax_for_loss": null,
    "num_workers": 4,
    "dist_config": {
        "dist_backend": "nccl",
        "dist_url": "tcp://localhost:54321",
        "world_size": 1
    }
}
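For anyone adapting this config: as far as I understand the generator, each mel frame is expanded by the product of `upsample_rates`, so that product should equal `hop_size` (here 8 × 8 × 2 × 2 × 2 = 512). Otherwise the generated waveform and the ground truth end up with different lengths when the loss is computed. Below is a minimal sketch of that sanity check. The function name `check_hop_consistency` is hypothetical, and the `segment_size` divisibility check is my assumption about what keeps training segments frame-aligned.

```python
import math


def check_hop_consistency(config):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []

    # Each upsampling stage multiplies the time axis by its rate, so the
    # product of all rates is the number of audio samples per mel frame.
    total_upsampling = math.prod(config["upsample_rates"])
    if total_upsampling != config["hop_size"]:
        problems.append(
            f"product of upsample_rates is {total_upsampling}, "
            f"but hop_size is {config['hop_size']}"
        )

    # One kernel size is needed per upsampling stage.
    if len(config["upsample_rates"]) != len(config["upsample_kernel_sizes"]):
        problems.append("upsample_rates and upsample_kernel_sizes differ in length")

    # Assumption: segments should contain a whole number of frames.
    if config["segment_size"] % config["hop_size"] != 0:
        problems.append("segment_size is not a multiple of hop_size")

    return problems


# Only the keys relevant to the check, copied from the config above.
config_44k = {
    "upsample_rates": [8, 8, 2, 2, 2],
    "upsample_kernel_sizes": [16, 16, 4, 4, 4],
    "hop_size": 512,
    "segment_size": 16384,
}
print(check_hop_consistency(config_44k))  # prints []
```

Running it on a config where `hop_size` was changed but `upsample_rates` was not should report the mismatch before any training time is spent.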
I'm not able to train with a hop size of 512 when the sampling rate is 44100, but it works when the hop size is 441.
@Grace9994 could you share your config? I can't find the segment_size, and the output audio duration does not exactly match the input audio.
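One possible explanation for the duration mismatch, offered as a guess rather than a diagnosis: the vocoder can only emit whole frames, so the output length is the number of mel frames times `hop_size`, and any leftover samples at the end of the input are dropped. The sketch below assumes the frame count is `floor(n_samples / hop_size)`. The exact count depends on how the mel extraction pads the signal, and `expected_output_samples` is a hypothetical helper.

```python
def expected_output_samples(n_input_samples, hop_size):
    """Estimate the vocoder output length for a given input length.

    Assumption: the mel spectrogram has floor(n_input_samples / hop_size)
    frames and the generator emits exactly hop_size samples per frame.
    """
    n_frames = n_input_samples // hop_size
    return n_frames * hop_size


n_in = 44100  # one second of audio at 44.1 kHz
n_out = expected_output_samples(n_in, hop_size=512)
print(n_in - n_out)  # prints 68
```

Under that assumption the difference is always smaller than one hop, so a mismatch of a few milliseconds would be expected, while a larger one would point to something else.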
If the wavs' sample rate is 48 kHz, how should I set the upsample parameters? At 48 kHz the hop size is 600, but the config in the code only upsamples by 256, so it raises an error when the loss is calculated.
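A possible way to approach this: pick `upsample_rates` whose product equals the hop size, for example by splitting the hop size into prime factors and merging the smallest ones until the desired number of stages is left. The sketch below does that. The helper names are hypothetical, and the kernel-size rule is an assumption on my part: it presumes the transposed convolutions use `padding = (kernel - rate) // 2`, in which case `kernel - rate` has to be even for each stage to scale the length by exactly `rate`.

```python
import math


def prime_factors(n):
    """Return the prime factors of n in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def suggest_upsample_rates(hop_size, n_stages=4):
    """Split hop_size into at most n_stages factors, largest first."""
    rates = prime_factors(hop_size)
    while len(rates) > n_stages:
        rates.sort()
        # Merge the two smallest factors to reduce the stage count by one.
        rates = rates[2:] + [rates[0] * rates[1]]
    return sorted(rates, reverse=True)


def matching_kernel_size(rate):
    """Roughly twice the rate, bumped by one so that (kernel - rate) is even."""
    kernel = 2 * rate
    return kernel if (kernel - rate) % 2 == 0 else kernel + 1


rates = suggest_upsample_rates(600)
kernels = [matching_kernel_size(r) for r in rates]
print(rates, math.prod(rates))  # prints [6, 5, 5, 4] 600
print(kernels)                  # prints [12, 11, 11, 8]
```

The same helper gives `[7, 7, 3, 3]` for a hop size of 441. This is only one valid factorization; other splits with the same product should satisfy the length constraint too, and which one sounds best is something I can't say without training.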