rishikksh20 / TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis
Apache License 2.0

Why does this network ask to enter any spectrogram at the last stage? #9

Open etimsijs opened 3 years ago

etimsijs commented 3 years ago

Hello, could you please tell me why the network asks for a spectrogram as input when producing its output? I mean this command: python inference.py -p [checkpoint path] -i [input mel path]. Usually, GANs sample random noise themselves as input, so why does this network need a mel spectrogram to produce a result?

rishikksh20 commented 3 years ago

@etimsijs Not in the case of vocoder GANs like MelGAN, VocGAN, or TFGAN. In these GANs we take the mel spectrogram as input and upsample it directly along the time axis by a factor equal to the hop size to get the audio waveform. We don't condition on noise; we use the mel spectrogram directly.
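
As a rough illustration of the upsampling arithmetic described above, here is a minimal sketch in plain Python. The hop size of 256 and the per-layer factors [8, 8, 4] are assumptions chosen for the example, not values taken from the TFGAN configuration, and `output_num_samples` is a hypothetical helper rather than a function from this repository.

```python
from math import prod

# Assumed values for illustration only; the real settings live in the model config.
HOP_SIZE = 256                 # audio samples per mel frame (assumption)
UPSAMPLE_FACTORS = [8, 8, 4]   # hypothetical per-layer upsampling rates


def output_num_samples(n_frames, factors=UPSAMPLE_FACTORS):
    """Return the waveform length a mel-conditioned vocoder would produce.

    Each upsampling layer stretches the time axis by its factor, so the
    total stretch is the product of all factors, which should equal the hop size.
    """
    length = n_frames
    for factor in factors:
        length *= factor
    return length


# The factors must multiply to the hop size so that every mel frame
# maps to exactly HOP_SIZE audio samples.
assert prod(UPSAMPLE_FACTORS) == HOP_SIZE

n_frames = 100  # number of mel frames in the input spectrogram
print(output_num_samples(n_frames))  # 25600
```

The point of the sketch is that the input mel spectrogram determines both the content and the length of the output, which is why inference needs a mel file rather than a noise vector.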

etimsijs commented 3 years ago

So, is this network for audiophiles? Can it increase the sampling rate from, say, 22 kHz to 41 kHz? Is it not for generating audio?