NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference
BSD 3-Clause "New" or "Revised" License
5.11k stars 1.39k forks

Inference problem: The tensor elements of audio are NaN #589

Open OuYiqiang opened 1 year ago

OuYiqiang commented 1 year ago

I am new to deep learning. When I ran the inference code, the generated audio was silent. Debugging in PyCharm shows that the elements of the audio tensor are NaN. I don't know what could have caused this, and I have not changed any part of the code. Does anyone know how to solve this problem?

libowen424 commented 1 year ago

I had the same problem: the elements of the audio tensor are NaN. After comparing my setup with NVIDIA's official inference code, I found that you need to remove the .half() call when loading the models:

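import torch
from hparams import create_hparams  # from the tacotron2 repo
from train import load_model        # from the tacotron2 repo

hparams = create_hparams()

# Load Tacotron 2 in full precision (no .half())
checkpoint_path = "tacotron2_statedict.pt"
model = load_model(hparams)
model.load_state_dict(torch.load(checkpoint_path)['state_dict'])
_ = model.cuda().eval()

# Load WaveGlow in full precision (no .half())
waveglow_path = 'waveglow_256channels_universal_v5.pt'
waveglow = torch.load(waveglow_path)['model']
waveglow.cuda().eval()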
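
To confirm whether the output is actually affected, it can help to scan the generated samples for NaN or infinite values before writing the audio file. The following is only a minimal sketch, not part of the repository: the helper name `first_non_finite` is hypothetical, and it assumes the audio has already been converted to a plain list of Python floats (for example via `audio.flatten().tolist()` on the tensor). The last line illustrates one common way NaNs can appear once a value has overflowed to infinity, which is a plausible (but here unverified) explanation for the half-precision behaviour described above.

```python
import math


def first_non_finite(samples):
    """Return the index of the first NaN/inf sample, or None if all are finite.

    `samples` is assumed to be an iterable of Python floats, e.g. the
    generated audio converted from a tensor to a flat list.
    """
    for index, value in enumerate(samples):
        if not math.isfinite(value):
            return index
    return None


# Hypothetical sample data standing in for generated audio.
clean_audio = [0.0, 0.25, -0.5, 0.125]
broken_audio = [0.0, float("nan"), 0.5]

print(first_non_finite(clean_audio))   # no non-finite sample found
print(first_non_finite(broken_audio))  # index of the NaN sample

# Once a value overflows to infinity, later arithmetic can turn it into NaN.
overflowed = float("inf") - float("inf")
print(math.isnan(overflowed))
```

On a tensor directly, `torch.isnan(audio).any()` performs the NaN part of this check without converting to a list.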