Open dsplog opened 6 months ago
in the meldataset.py, could see that all wav files are resampled to 24000sps. however, as the MelSpectrogram() transform is called without sample_rate argument defaults to 16000sps.
sample_rate
to_mel = torchaudio.transforms.MelSpectrogram( n_mels=80, n_fft=2048, win_length=1200, hop_length=300) mean, std = -4, 4 def preprocess(wave): wave_tensor = torch.from_numpy(wave).float() mel_tensor = to_mel(wave_tensor) mel_tensor = (torch.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std return mel_tensor
questions :
yl4579/StarGANv2-VC#10 and #57, should be helpful.
in the meldataset.py, could see that all wav files are resampled to 24000sps. however, as the MelSpectrogram() transform is called without
sample_rate
argument defaults to 16000sps.questions :