Error when using high rates

SuperKogito commented 4 years ago

error

Hi, I am currently working on the second phase of my experiment using your source code. Thank you :) I just have one question about the audio and rate arguments of the features vector: I am having trouble with the sampling rate when I try to substitute 44100 for it. Can you please tell me where I am going wrong?

Originally posted by @thxrgxxs in https://github.com/SuperKogito/Voice-based-gender-recognition/issues/4#issuecomment-619394096

SuperKogito commented 4 years ago

Please provide your code too, not just the error logs. From your code, this line does not seem correct:

features_vector = extract_features(audio=".wav audio", rate=44100)

The audio variable is supposed to be the audio signal from which to compute features.
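To illustrate the point above, here is a minimal sketch of turning a WAV file into a signal before computing features. It assumes `scipy.io.wavfile.read` is used for loading (as in the imports posted later in this thread); the helper name `load_signal` and the mono down-mix are illustrative assumptions, not part of the repository.

```python
import numpy as np
from scipy.io import wavfile


def load_signal(wav_path):
    """Read a WAV file and return (signal, rate).

    Hypothetical helper, not part of the repository.
    """
    # wavfile.read returns the sampling rate first, then the samples.
    rate, signal = wavfile.read(wav_path)
    if signal.ndim > 1:
        # Assumption: average the channels of a stereo recording down to mono.
        signal = signal.mean(axis=1)
    return signal, rate
```

The returned values could then be passed on instead of a string, for example `signal, rate = load_signal("path/to/recording.wav")` followed by `extract_features(audio=signal, rate=rate)`, where the path is a placeholder.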

thxrgxxs commented 4 years ago

import numpy as np
from sklearn import preprocessing
from scipy.io.wavfile import read
from python_speech_features import mfcc
from python_speech_features import delta
from GenderIdentifier import GenderIdentifier

def extract_features(audio, rate):
        mfcc_feature = mfcc(# The audio signal from which to compute features.
                            audio,
                            # The samplerate of the signal we are working with.
                            rate,
                            # The length of the analysis window in seconds. 
                            # Default is 0.025s (25 milliseconds)
                            winlen       = 0.05,
                            # The step between successive windows in seconds. 
                            # Default is 0.01s (10 milliseconds)
                            winstep      = 0.01,
                            # The number of cepstrum to return. 
                            # Default 13.
                            numcep       = 13,
                            # The number of filters in the filterbank.
                            # Default is 26.
                            nfilt        = 30,
                            # The FFT size. Default is 512.
                            nfft         = 1024,
                            # If true, the zeroth cepstral coefficient is replaced 
                            # with the log of the total frame energy.
                            appendEnergy = True)

        mfcc_feature  = preprocessing.scale(mfcc_feature)
        deltas        = delta(mfcc_feature, 2)
        double_deltas = delta(deltas, 2)
        combined      = np.hstack((mfcc_feature, deltas, double_deltas))
        return combined

# init gender identifier
gender_identifier = GenderIdentifier("TestingData/females", 
                                     "TestingData/males", 
                                     "females.gmm", "males.gmm")
# get audio features vector
features_vector = extract_features(audio=".wav audio", rate = 44100)

# predict/identify speaker's gender
predicted_gender = gender_identifier.identify_gender(features_vector)

The recorded audio files are stored in the ".wav audio" file, in .wav format. So where do I insert the file path? Thank you for your guidance, btw :)

SuperKogito commented 4 years ago

I don't think this will work. You are training your identifier on data with a 16000 Hz sampling rate and then using it to recognize gender from a file with a 44100 Hz sampling rate. Please take a look at the graph in the README and at this article that I wrote, to get a better grasp of the theory. To get correct results, you need to use training and testing data with the same sampling rate.
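As a rough sketch of bringing a 44100 Hz recording to the 16000 Hz rate mentioned above, one option is polyphase resampling with `scipy.signal.resample_poly`. The helper name `resample_to` is hypothetical and this is only one possible approach; converting the files with an external audio tool would work as well.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_to(signal, orig_rate, target_rate=16000):
    """Resample a 1-D signal from orig_rate to target_rate (hypothetical helper)."""
    signal = np.asarray(signal, dtype=np.float64)
    orig_rate, target_rate = int(orig_rate), int(target_rate)
    if orig_rate == target_rate:
        return signal
    # Reduce the ratio target/orig to the smallest integer up/down factors,
    # e.g. 44100 -> 16000 becomes up=160, down=441.
    common = gcd(orig_rate, target_rate)
    return resample_poly(signal, target_rate // common, orig_rate // common)
```

One second of audio at 44100 Hz (44100 samples) comes out as 16000 samples, so the resampled signal can be passed to the feature extraction together with `rate=16000`.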

The docs are very clear about this: https://github.com/SuperKogito/Voice-based-gender-recognition/blob/f266bbc58229925044b0a3782b92dca6b18739bf/Code/FeaturesExtractor.py#L20

So:

features_vector = extract_features(audio="INSERT-AUDIO-FILE-PATH-HERE", rate = 44100)

thxrgxxs commented 4 years ago

Alright, I'll work on this and let you know how it turns out.

thxrgxxs commented 4 years ago

I converted all my audio files to 16000 Hz and also inserted the file path, but it keeps showing the same error in the "rate" variable.

SuperKogito commented 4 years ago

If you are using the correct rates and have placed the files in the right folders, then running the features extraction, training and testing code sections should be straightforward. Without seeing the code or the error logs, I can only guess what's wrong, so I am afraid that I cannot provide any reliable input.
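One way to verify the "correct rates" part before running anything else is to scan the data folders and list every WAV file whose sampling rate differs from the expected one. The sketch below uses only Python's standard `wave` module, which reads plain PCM WAV files; the function name `find_rate_mismatches` and the 16000 Hz default are assumptions made for illustration.

```python
import os
import wave


def find_rate_mismatches(folder, expected_rate=16000):
    """Return (path, rate) for every .wav file under folder whose rate differs.

    Hypothetical helper, not part of the repository.
    """
    mismatches = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            if not name.lower().endswith(".wav"):
                continue
            path = os.path.join(root, name)
            # The header stores the sampling rate; no samples need decoding.
            with wave.open(path, "rb") as wav_file:
                rate = wav_file.getframerate()
            if rate != expected_rate:
                mismatches.append((path, rate))
    return mismatches
```

Calling it on each training and testing folder (for example `find_rate_mismatches("TestingData/females")`) should return an empty list when every file is already at 16000 Hz.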

thxrgxxs commented 4 years ago

error code.zip

These are the WAV files I recorded and tried to run using the FeatureExtractor1.py (real-time gender classification) code. All the audio files are at 16000 Hz. Can you please help me out? Thanks.

SuperKogito / Voice-based-gender-recognition

Error when using high rates #7