maum-ai / univnet

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
https://mindslab-ai.github.io/univnet/
BSD 3-Clause "New" or "Revised" License

Testing from .wav failed #10

Open kelseyjd opened 1 year ago

kelseyjd commented 1 year ago

Hi, I ran the testing code below to convert .wav -> mel with librosa, then used UnivNet with a pretrained checkpoint to invert the mel back to audio, but the results were extremely bad. Can you point out what I'm doing wrong? The input file is clean US English speech. Arguments: `-p ./chkpt/univ_c16_0292.pt -c config/default_c16.yaml -i /Users/kelseyd/Documents/train/TF -o ./out`

    for filename in tqdm.tqdm(glob.glob(os.path.join(args.input_folder, '*.wav'))):
        y, sr = librosa.load(filename, sr=24000)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, n_mels=100, fmin=0, fmax=12000)
        mel = torch.from_numpy(mel)

        if len(mel.shape) == 2:
            mel = mel.unsqueeze(0)

        audio = model.inference(mel)
        audio = audio.cpu().detach().numpy()
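
Is the problem on the feature side? Below is a minimal sketch of the kind of log-scaled mel front end I suspect the pretrained checkpoint expects. The hop_length=256, win_length=1024, linear-magnitude STFT, and natural-log compression are my assumptions read off the default_c16.yaml conventions, not values I have confirmed, so please correct me if the config differs:

    import glob
    import os

    import librosa
    import numpy as np
    import torch

    # Assumed feature parameters; please verify against config/default_c16.yaml.
    SR, N_FFT, HOP, WIN, N_MELS, FMIN, FMAX = 24000, 1024, 256, 1024, 100, 0, 12000

    def wav_to_logmel(path):
        """Load a wav file and return a (1, n_mels, frames) log-mel tensor."""
        y, _ = librosa.load(path, sr=SR)
        # Linear-magnitude STFT, mel projection, then natural-log compression with a floor,
        # instead of librosa's default power mel spectrogram with no log scaling.
        spec = np.abs(librosa.stft(y, n_fft=N_FFT, hop_length=HOP, win_length=WIN))
        mel_basis = librosa.filters.mel(sr=SR, n_fft=N_FFT, n_mels=N_MELS, fmin=FMIN, fmax=FMAX)
        mel = np.log(np.clip(mel_basis @ spec, a_min=1e-5, a_max=None))
        return torch.from_numpy(mel).unsqueeze(0)

If the checkpoint was indeed trained on log-mels like this, my original `librosa.feature.melspectrogram` call (power spectrogram, default hop_length=512, no log) might explain the bad output.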