antovespoli3 closed 2 years ago
Looks like the checkpoint file is problematic. I will update the checkpoint file in a few days. Sorry for the inconvenience.
Sounds great, thanks.
Checkpoint updated.
Hi author, when I use the provided HiFi-GAN checkpoint to run inference on mel-spectrograms extracted with AutoVC's make_spect.py, I get a very low voice. What I'm not sure about is what the config.json for that checkpoint looks like. I noticed some small differences in how the mel-spectrograms are calculated that could cause the issue: AutoVC passes fmin (as high as 90 Hz) and fmax to the mel filterbank, while the original HiFi-GAN didn't use these parameters. Could you share the config.json used to train the vocoder checkpoint? Thanks!
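One way to narrow this down is to diff the mel-related settings of the extraction script against the vocoder's config.json. The sketch below is only an illustration: the two dictionaries hold placeholder values I have assumed for the example (they are not confirmed by this thread), and `mel_mismatches` is a hypothetical helper, not part of either repository. Replace the values with what your own make_spect.py and the checkpoint's config.json actually contain.

```python
# Placeholder values for illustration only. Read the real ones from the
# config.json shipped with the checkpoint and from your extraction script.
extractor_cfg = {
    "sampling_rate": 16000,
    "n_fft": 1024,
    "hop_size": 256,
    "num_mels": 80,
    "fmin": 90,
    "fmax": 7600,
}
vocoder_cfg = {
    "sampling_rate": 22050,
    "n_fft": 1024,
    "hop_size": 256,
    "num_mels": 80,
    "fmin": 0,
    "fmax": 8000,
}

# Settings that must agree for the vocoder to see the features it was trained on.
MEL_KEYS = ("sampling_rate", "n_fft", "hop_size", "num_mels", "fmin", "fmax")


def mel_mismatches(extractor, vocoder, keys=MEL_KEYS):
    """Return {key: (extractor_value, vocoder_value)} for every differing key."""
    return {
        key: (extractor.get(key), vocoder.get(key))
        for key in keys
        if extractor.get(key) != vocoder.get(key)
    }


diff = mel_mismatches(extractor_cfg, vocoder_cfg)
for key, (ext_value, voc_value) in diff.items():
    print(f"{key}: extractor={ext_value} vocoder={voc_value}")
```

Any key reported here is a candidate cause. In particular, a sampling-rate mismatch between feature extraction and the rate at which the output file is written would shift the pitch of the result.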
I have tried to use the new hifi-gan model that you recently added to the repository, but I can't quite understand if I am doing it right because I can't reproduce any meaningful sound. So what I am doing is the following:
python inference_e2e.py --checkpoint_file ./g_03295000
where the model is the one downloaded from your Google Drive link. By doing so, I get a very high-pitched sound.
I also tried reshaping it to (frames, num_mels, 1) instead of (1, num_mels, frames), and I get a silent file.
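On the shape question, a minimal NumPy sketch may help. It assumes the saved spectrogram is a 2-D array laid out as (frames, num_mels), which is an assumption on my part and should be checked against the actual file. Under that assumption, getting to (1, num_mels, frames) needs a transpose plus a batch axis. A plain reshape produces the same shape but scrambles which value belongs to which mel bin and frame.

```python
import numpy as np

frames, num_mels = 5, 80

# Stand-in for a loaded spectrogram, assumed to be stored as (frames, num_mels).
mel = np.arange(frames * num_mels, dtype=np.float32).reshape(frames, num_mels)

# Transpose to (num_mels, frames), then add a leading batch axis.
batched = mel.T[np.newaxis, :, :]

# Same target shape via reshape, but the values end up in the wrong places.
reshaped = mel.reshape(1, num_mels, frames)

print(batched.shape, reshaped.shape)
print(np.array_equal(batched, reshaped))
```

After the transpose, `batched[0, m, t]` is mel bin `m` at frame `t`, which is the layout (1, num_mels, frames) implies.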
Am I doing something wrong?
P.S. I am using config_v1.json as the configuration file for the model.