sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++
MIT License
1.13k stars, 134 forks

Multi Scale STFT Discriminator checkpoint #43

Closed: chnk58hoang closed this issue 3 months ago

chnk58hoang commented 3 months ago

Hello. First of all, thank you very much for your great work! I'm trying to fine-tune the hierarchical speech synthesizer on my own dataset. As I understand it, your adversarial training uses two different discriminators: the multi-period discriminator (MPD) and the multi-scale STFT discriminator (MS-STFT). In #4, I saw that you shared only the MPD checkpoint. Could you kindly share the MS-STFT checkpoint as well? If I have misunderstood, please correct me. Once again, your work is wonderful!

sh-lee-prml commented 3 months ago

Hi

Sorry for the confusion.

MultiPeriodDiscriminator includes both the MPD (DiscriminatorP) and the multi-scale STFT discriminator (DiscriminatorR), so the shared checkpoint covers both.

See https://github.com/sh-lee-prml/HierSpeechpp/blob/main/hierspeechpp_speechsynthesizer.py#L536C1-L545C51:

```python
class MultiPeriodDiscriminator(torch.nn.Module):
    def __init__(self, use_spectral_norm=False):
        super(MultiPeriodDiscriminator, self).__init__()
        # Periods for the DiscriminatorP (MPD) sub-discriminators
        periods = [2,3,5,7,11]
        # STFT resolutions for the DiscriminatorR (multi-scale STFT) sub-discriminators
        resolutions = [[2048, 512, 2048], [1024, 256, 1024], [512, 128, 512], [256, 64, 256], [128, 32, 128]]

        # DiscriminatorR instances come first, followed by DiscriminatorP instances
        discs = [DiscriminatorR(resolutions[i], use_spectral_norm=use_spectral_norm) for i in range(len(resolutions))]
        discs = discs + [DiscriminatorP(i, use_spectral_norm=use_spectral_norm) for i in periods]

        self.discriminators = nn.ModuleList(discs)
```

Thanks!
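
A rough way to confirm this on a downloaded checkpoint is to map each index in `self.discriminators` to the sub-discriminator built there and count the parameter names per type. The sketch below is plain Python and only reproduces the construction order from the quoted constructor. The helper names `describe_discriminators` and `summarize_checkpoint_keys` are hypothetical, and it assumes parameter names of the form `discriminators.<index>.<...>`, which is how `nn.ModuleList` names its children.

```python
# Values copied from the constructor quoted in this thread.
PERIODS = [2, 3, 5, 7, 11]
RESOLUTIONS = [[2048, 512, 2048], [1024, 256, 1024], [512, 128, 512],
               [256, 64, 256], [128, 32, 128]]


def describe_discriminators(resolutions=RESOLUTIONS, periods=PERIODS):
    """Map each index in `self.discriminators` to (class name, setting).

    Mirrors the construction order in the quoted code: every DiscriminatorR
    first, then every DiscriminatorP.
    """
    layout = {}
    for res in resolutions:
        layout[len(layout)] = ("DiscriminatorR", res)
    for period in periods:
        layout[len(layout)] = ("DiscriminatorP", period)
    return layout


def summarize_checkpoint_keys(keys, layout):
    """Count parameter names per sub-discriminator type.

    Assumption: keys contain "discriminators.<index>." somewhere in the name.
    Keys that do not match, or whose index is not in `layout`, are ignored.
    """
    counts = {}
    for key in keys:
        parts = key.split(".")
        if "discriminators" not in parts:
            continue
        pos = parts.index("discriminators")
        if pos + 1 >= len(parts) or not parts[pos + 1].isdigit():
            continue
        entry = layout.get(int(parts[pos + 1]))
        if entry is None:
            continue
        kind = entry[0]
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

With a real checkpoint, the keys would come from the discriminator's state dict after loading the file with `torch.load(..., map_location="cpu")`. Where that state dict sits inside the checkpoint file is not stated in this thread, so inspect the loaded object first. If both `DiscriminatorR` and `DiscriminatorP` show up in the counts, the single checkpoint covers both discriminators.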

chnk58hoang commented 3 months ago

Thanks a lot for your reply!