auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License
636 stars, 92 forks

Downsampling for VCTK corpus #38

Open biggytruck opened 3 years ago

biggytruck commented 3 years ago

The VCTK corpus is sampled at 48 kHz, while the model expects 16 kHz audio. To match the sampling rate, I used librosa's resample function. My code looks like this:

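import librosa

# Load at the corpus's native 48 kHz rate, then resample to 16 kHz.
# Keyword arguments are used because newer librosa releases require them.
y, sr = librosa.load(wav_file, sr=48000)
y_16k = librosa.resample(y, orig_sr=sr, target_sr=16000)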

Is this the same code you used to downsample the audio? I'm asking because I want to make sure my data distribution matches yours.

auspicious3000 commented 3 years ago

No, but this shouldn't matter.
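
For context on why the choice of resampler is unlikely to matter much: going from 48 kHz to 16 kHz amounts to low-pass filtering below the new Nyquist frequency (8 kHz) and keeping every third sample. Reasonable resamplers differ mainly in how they design that filter. The sketch below illustrates the idea in pure Python. It is not the method used by librosa or by the SpeechSplit authors, and the function names, filter length, and window are assumptions chosen for illustration.

```python
import math


def lowpass_taps(num_taps=121, cutoff=1.0 / 6.0):
    """Hamming-windowed sinc low-pass; cutoff is in cycles per input sample.

    1/6 cycle per sample corresponds to 8 kHz at a 48 kHz input rate.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response, with the x == 0 limit handled explicitly.
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalize to unity gain at DC


def downsample_48k_to_16k(samples, taps=None):
    """Low-pass filter, then keep every third sample (48 kHz -> 16 kHz)."""
    taps = taps or lowpass_taps()
    half = len(taps) // 2
    out = []
    # Only every third output of the filter is computed, since the rest are discarded.
    for center in range(0, len(samples), 3):
        acc = 0.0
        for k, tap in enumerate(taps):
            i = center + k - half
            if 0 <= i < len(samples):  # treat samples outside the signal as zero
                acc += tap * samples[i]
        out.append(acc)
    return out


def tone(freq_hz, num_samples=2400, sr=48000):
    """Unit-amplitude sine wave, used here only as test input."""
    return [math.sin(2 * math.pi * freq_hz * n / sr) for n in range(num_samples)]


def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))


low = downsample_48k_to_16k(tone(1000))    # well below 8 kHz: should pass
high = downsample_48k_to_16k(tone(12000))  # above 8 kHz: should be suppressed
print(len(low), rms(low[100:700]), rms(high[100:700]))
```

A 1 kHz tone keeps essentially its full level, while a 12 kHz tone, which cannot be represented at 16 kHz, is strongly attenuated instead of aliasing into the output. With librosa, an alternative to a separate resample call is to pass sr=16000 to librosa.load, which resamples on load.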