auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
https://arxiv.org/abs/1905.05879
MIT License

hifi_gan sampling rate #106

Closed zjuPeco closed 2 years ago

zjuPeco commented 2 years ago

Hi, author!

The default hifi_gan sampling rate in "config_v1.json" is 22050 Hz. However, when I use the provided "g_03280000" checkpoint file, I need to set the output sampling rate to 16000 Hz for the audio to sound normal.

Did you change the sampling rate to 16000 when training hifi-gan?

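import json

import IPython.display as ipd
import numpy as np
import torch

# hifigan_model, AttrDict, load_checkpoint, device and result are assumed to be
# imported or defined earlier (not shown in this post). result is presumably
# the converted mel-spectrogram as a NumPy array of shape (frames, n_mels).
MAX_WAV_VALUE = 32768.0
hifi_gan_path = "/work/pretrained_weights/g_03280000"
hifi_config_path = "/work/hifi_gan/config_v1.json"

# Build the generator from the JSON config.
with open(hifi_config_path, 'r') as f:
    json_config = json.load(f)
h = AttrDict(json_config)
hifi_generator = hifigan_model(h).to(device)

# Load the pretrained generator weights and switch to inference mode.
state_dict_g = load_checkpoint(hifi_gan_path, device)
hifi_generator.load_state_dict(state_dict_g['generator'])
hifi_generator.eval()
hifi_generator.remove_weight_norm()

with torch.no_grad():
    # Swap the two axes, then add a leading batch dimension.
    result_trans = result.transpose()
    y_g_hat = hifi_generator(torch.Tensor(result_trans[np.newaxis, :, :]).to(device))
    audio = y_g_hat.squeeze()
    # Scale the float waveform to the 16-bit PCM range.
    audio = audio * MAX_WAV_VALUE
    audio = audio.cpu().numpy().astype('int16')

# Sounds normal only at 16000, not at the 22050 from config_v1.json.
ipd.Audio(audio, rate=16000)

auspicious3000 commented 2 years ago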

Yes. The sampling rate is 16 kHz.
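
For anyone hitting the same mismatch, here is a minimal sketch of one way to keep the rate consistent. It assumes the config is a JSON object with a "sampling_rate" key (as in "config_v1.json"). The helper names `load_config_with_rate`, `playback_speed_ratio` and `write_wav` are hypothetical and are not part of autovc or hifi_gan. It uses only the Python standard library, so it can be tried without the model.

```python
import io
import json
import struct
import wave

# 16 kHz is the rate confirmed in this thread for the provided checkpoint.
CHECKPOINT_SAMPLE_RATE = 16000


def load_config_with_rate(config_text, sample_rate=CHECKPOINT_SAMPLE_RATE):
    """Parse a JSON config string and override its "sampling_rate" entry."""
    config = json.loads(config_text)
    config["sampling_rate"] = sample_rate
    return config


def playback_speed_ratio(playback_rate, true_rate):
    """How much faster (and higher-pitched) audio plays at the wrong rate."""
    return playback_rate / true_rate


def write_wav(fileobj, samples, sample_rate):
    """Write mono int16 samples to a WAV file object at the given rate."""
    pcm = struct.pack("<%dh" % len(samples), *samples)  # little-endian int16
    with wave.open(fileobj, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)


# Example: a stand-in for the relevant part of config_v1.json (assumed layout).
config = load_config_with_rate('{"sampling_rate": 22050}')
ratio = playback_speed_ratio(22050, config["sampling_rate"])

# Save three dummy samples with the corrected rate in the WAV header.
buffer = io.BytesIO()
write_wav(buffer, [0, 1000, -1000], config["sampling_rate"])
```

Playing 16 kHz audio at 22050 Hz speeds it up by a factor of 22050 / 16000, roughly 1.38, which is why the output sounds wrong unless the rate is set to 16000.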