ghost closed this issue 4 years ago
Sure, you can do that using the GenderIdentifier class, but you have to slightly edit the FeaturesExtractor class function. Assuming you have recorded some speech in real time and formatted it as a mono audio array, you can compute the features characterising that array and then use the resulting features vector for the prediction. This can be done like the following:
import numpy as np
from sklearn import preprocessing
from scipy.io.wavfile import read
from python_speech_features import mfcc
from python_speech_features import delta
from GenderIdentifier import GenderIdentifier


def extract_features(audio, rate):
    mfcc_feature = mfcc(audio,              # the audio signal from which to compute features
                        rate,               # the samplerate of the signal we are working with
                        winlen=0.05,        # length of the analysis window in seconds (default: 0.025 s)
                        winstep=0.01,       # step between successive windows in seconds (default: 0.01 s)
                        numcep=13,          # number of cepstra to return (default: 13)
                        nfilt=30,           # number of filters in the filterbank (default: 26)
                        nfft=1024,          # the FFT size (default: 512)
                        appendEnergy=True)  # if True, the zeroth cepstral coefficient is replaced
                                            # with the log of the total frame energy
    mfcc_feature = preprocessing.scale(mfcc_feature)
    deltas = delta(mfcc_feature, 2)
    double_deltas = delta(deltas, 2)
    combined = np.hstack((mfcc_feature, deltas, double_deltas))
    return combined


# init gender identifier
gender_identifier = GenderIdentifier("TestingData/females",
                                     "TestingData/males",
                                     "females.svm", "males.svm")

# get audio features vector
features_vector = extract_features(audio=recorded_audio_data, rate=sampling_rate)

# predict/identify speaker's gender
predicted_gender = gender_identifier.identify_gender(features_vector)
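For reference, np.hstack stacks the three feature matrices column-wise, so each frame ends up with 3 × 13 = 39 coefficients. A quick sketch with random stand-in matrices (the real ones come from the mfcc and delta calls above):

```python
import numpy as np

# Hypothetical stand-ins for the three (num_frames, numcep) matrices;
# real values would come from mfcc() and delta().
num_frames, numcep = 200, 13
mfcc_feature = np.random.randn(num_frames, numcep)
deltas = np.random.randn(num_frames, numcep)
double_deltas = np.random.randn(num_frames, numcep)

# Column-wise stacking: each frame gets 13 MFCCs + 13 deltas + 13 double-deltas.
combined = np.hstack((mfcc_feature, deltas, double_deltas))
print(combined.shape)  # (200, 39)
```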
As for how to get the recorded audio, I suggest following something like this; there are a few things to keep in mind there.
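In the meantime, if you already have a recording on disk, the scipy import above can load it straight into a mono numpy array. A self-contained sketch (it writes a tiny synthetic stereo file first so it runs as-is; in practice you would point read() at your own recording):

```python
import numpy as np
from scipy.io.wavfile import read, write

# Write a short synthetic stereo file so the sketch is self-contained;
# substitute the path of your own recording in real use.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
tone = (0.3 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
write("recording.wav", sr, np.stack([tone, tone], axis=1))

rate, data = read("recording.wav")
if data.ndim > 1:               # stereo -> mono by averaging the channels
    data = data.mean(axis=1)
audio = data.astype(np.float64)
print(rate, audio.shape)        # 16000 (16000,)
```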
Hope this helps :)
This is hugely helpful. Thank you for the code snippet and the link. I'll let you know how I get on with the real-time piping.
I'll try and match the sampling rate as best I can and see what I can come up with.
Hi, I am currently working on the second phase of my experiment using your source code, thank you :) I just have one doubt about the audio and rate arguments of the features vector: I am having trouble with sampling_rate when I try to substitute it with 44100. Can you please tell me where I am going wrong?
Since OP did not follow up on this, I will assume that their issue was solved and close this one. @thxrgxxs please refer to #7.
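@thxrgxxs one likely cause is a sampling-rate mismatch: if the models were trained on audio at a lower rate, a 44100 Hz recording should be resampled before feature extraction. A minimal sketch with scipy.signal.resample_poly (the 16 kHz target is an assumption; check the rate of your training data):

```python
import numpy as np
from scipy.signal import resample_poly

orig_rate, target_rate = 44100, 16000        # assumed target rate; verify against your training data
audio_44k = np.random.randn(orig_rate)       # one second of stand-in audio

# Reduce the rate ratio to the smallest integer up/down factors.
g = np.gcd(orig_rate, target_rate)           # 44100:16000 reduces to 441:160
audio_16k = resample_poly(audio_44k, target_rate // g, orig_rate // g)
print(audio_16k.shape)                       # (16000,)
```

After this, pass the resampled array and 16000 as the rate to extract_features.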
I see the Python files rely on files for input.
Can I pipe a .wav file into a Python script?
I'm exploring possibilities for classifying a voice in real time.
I got the program working fine. Very nice code. Thank you.