Closed EmreOzkose closed 2 years ago
A vocoder was used to extract features so that we could listen to the converted audio (by passing the converted mel-spectrogram through the vocoder in the reverse direction).
Just for convenience between converted mel and vocoder, right? Thank you. Do you think if changing pre-processing of mel is a significant change for generating good sounds?
In my case, the vocoder is insufficient to produce good sounds. I have a mb-melgan which is trained on my data and works well. I will do experiments with changed mel, and report here.
I've the same doubt, @EmreOzkose did you manage to arrange a different mel conversion?
I conducted experiments with changed mels. I trained mb-melgan with changed mels. Voice conversion was not good enough compared to default mels.
Great to know, thanks!
Hi!
Thanks for sharing code.
Why did you use vocoder() funtion to extract features? These features are mel-spectrograms, right? Why didn't use just librosa? I tried and differences between extracted featurs are not negligible. Ranges are so different.