seungwonpark / melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
http://swpark.me/melgan/
BSD 3-Clause "New" or "Revised" License
633 stars, 116 forks

Questions to use melgan on my own dataset #29

Closed geekboood closed 4 years ago

geekboood commented 4 years ago

Hi, I ran into a couple of questions while trying to use melgan on my own dataset. First, the comment in default.yaml says we should leave hop_length at 256. Why can't I change this value? Is this a limitation of the model structure? Second, in the MelFromDisk class, __getitem__ uses a mapping when loading the training set. What is this mapping used for? As far as I can tell, the input idx lies in [0, len(wav_list)) and the mapping covers the same interval.

seungwonpark commented 4 years ago

Is this a limitation of the model structure?

Yes. The model architecture upsamples the mel-spectrogram by a factor of 256, so each mel frame must correspond to exactly 256 audio samples, and hop_length can't be changed.
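To make the constraint concrete, here is a minimal sketch of the arithmetic. The per-stage strides `[8, 8, 2, 2]` are an assumption taken from the usual MelGAN generator layout, not read from this repository's code, and the helper names are hypothetical. The point is only that the product of the upsampling strides has to equal hop_length.

```python
from math import prod

# Assumed per-stage upsampling strides of the generator (illustrative values;
# check the actual generator definition before relying on them).
UPSAMPLE_STRIDES = [8, 8, 2, 2]


def total_upsampling(strides):
    """Overall upsampling factor: the product of all stage strides."""
    return prod(strides)


def output_samples(num_mel_frames, strides=UPSAMPLE_STRIDES):
    """Number of audio samples the generator emits for a given number of mel frames."""
    return num_mel_frames * total_upsampling(strides)


def check_hop_length(hop_length, strides=UPSAMPLE_STRIDES):
    """Raise if hop_length does not match the generator's upsampling factor."""
    factor = total_upsampling(strides)
    if hop_length != factor:
        raise ValueError(
            f"hop_length={hop_length} does not match upsampling factor {factor}"
        )
    return hop_length


if __name__ == "__main__":
    print(total_upsampling(UPSAMPLE_STRIDES))  # overall factor
    print(output_samples(100))                 # samples produced for 100 mel frames
    check_hop_length(256)                      # passes without raising
```

Under this assumption, using a different hop_length would mean changing the strides so that their product matches the new value, and then retraining.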

What is this mapping used for?

I intended to use different batches for the D and G updates within a single epoch, and this random mapping is how I do that.
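The idea can be sketched as follows. This is a simplified illustration, not the repository's MelFromDisk class: the class name `PairedIndexDataset`, the `seed` parameter, and the use of plain list items are all hypothetical. Each training index returns two items, the one at `idx` and the one at `mapping[idx]`, so the two updates can draw on different data even though idx and the mapping cover the same range.

```python
import random


class PairedIndexDataset:
    """Toy dataset that returns two items per index via a shuffled mapping."""

    def __init__(self, items, seed=None):
        self.items = list(items)
        # Start with the identity mapping: index i maps to itself.
        self.mapping = list(range(len(self.items)))
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self.items)

    def shuffle_mapping(self):
        """Re-draw the random permutation, e.g. once per epoch."""
        self._rng.shuffle(self.mapping)

    def __getitem__(self, idx):
        # The first item comes from idx itself; the second comes from the
        # permuted index, so one can feed each update a different sample.
        return self.items[idx], self.items[self.mapping[idx]]


if __name__ == "__main__":
    ds = PairedIndexDataset(range(10), seed=0)
    ds.shuffle_mapping()
    first, second = ds[3]
    print(first, second)
```

Because the mapping is a permutation of the same index range, every sample is still used exactly once per epoch in each role. Only the pairing changes when the mapping is reshuffled.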