findmenowhere closed this issue 3 years ago
Here's my code to extract features from a single recording:
import librosa
import numpy

def extract(wav):
    # Takes a waveform (160,000 samples at a 16,000 Hz sampling rate) and
    # returns log mel filterbank features of shape (400, 64).
    spec = librosa.core.stft(wav, n_fft = 4096,
                             hop_length = 400, win_length = 1024,
                             window = 'hann', center = True, pad_mode = 'constant')
    mel = librosa.feature.melspectrogram(S = numpy.abs(spec), sr = 16000, n_mels = 64, fmax = 8000)
    # Keep the first 400 frames, then convert to decibels.
    logmel = librosa.core.power_to_db(mel[:, :400])
    return logmel.T.astype('float32')
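As a quick check on the 400-frame figure in the comment: with center = True the waveform is padded on both sides, so the STFT should produce one frame per hop plus one, and the `[:, :400]` slice then drops the extra frame. The helper below is only an illustrative sketch; `stft_frame_count` is a hypothetical name, not a librosa function.

```python
def stft_frame_count(n_samples, hop_length):
    # Number of frames from a centered STFT (center=True):
    # one frame at sample 0, then one more per full hop.
    return 1 + n_samples // hop_length

# 10 seconds at 16 kHz with a hop of 400 samples
frames = stft_frame_count(160000, 400)
print(frames)  # 401, which the slice mel[:, :400] trims to 400
```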
After feature extraction, I normalized each dimension to have zero mean and unit variance globally (i.e. across all training recordings).
If you downloaded the data from my GitHub repo, you should find "normalizer.pkl" files that contain the "mu" and "sigma" for normalization.
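For anyone computing these statistics on their own data, here is a minimal sketch of the global normalization described above. It assumes each recording is a (frames, 64) array as returned by `extract`; the function names `fit_normalizer` and `normalize` and the `eps` guard are my own assumptions, and the internal layout of "normalizer.pkl" is not shown in this thread, so loading it is left out.

```python
import numpy as np

def fit_normalizer(feature_list):
    # Stack the frames of all training recordings: (total_frames, n_mels).
    frames = np.concatenate(feature_list, axis=0).astype('float64')
    mu = frames.mean(axis=0)     # per-dimension mean
    sigma = frames.std(axis=0)   # per-dimension standard deviation
    return mu, sigma

def normalize(features, mu, sigma, eps=1e-8):
    # eps (an assumption, not from the thread) avoids division by zero
    # for a dimension that happens to be constant.
    return ((features - mu) / (sigma + eps)).astype('float32')
```

The statistics are fitted once on the training set and then reused unchanged for validation and test recordings.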
Thanks! That's what I need.
Hi Maigo,
How do you extract the filterbank features from the audio? I couldn't find any data-processing code, and the data downloaded by the bash script is already preprocessed.