Closed: hirunifernando closed this issue 6 years ago.
Hi hirunifernando, the failing call in stFeatureSpeed is [fbank, freqs] = mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil); just remove the parameters starting from the 3rd when invoking mfccInitFilterBanks.
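For example, the corrected call inside stFeatureSpeed would look like this (a minimal sketch, assuming the installed mfccInitFilterBanks only accepts the sampling rate and FFT size):

[fbank, freqs] = mfccInitFilterBanks(Fs, nfft)  # drop lowfreq, linsc, logsc, nlinfil, nlogfil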
Thank you RayanWang, I really appreciate your quick reply. If I use audio files of different durations, do I need to change the frame_size and step? Does it affect the feature vector?
The frame_size and step are the same as those mentioned in the Microsoft paper. They have nothing to do with the file duration. The only thing you have to pay attention to is the maximum length (max_len) in feature extraction. At present I hard-code this value to 1024 (for padding sequences). If your audio files are very long, you may have to increase this value.
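If you do need to change it, here is a minimal padding sketch (not the repository's exact code; the helper name and array shapes are illustrative), showing how frame-level features are padded or truncated to max_len:

import numpy as np

max_len = 1024  # increase this if your audio files produce more frames

def pad_to_max_len(feat, max_len=max_len):
    # feat: (n_frames, n_features) array of frame-level features
    if feat.shape[0] >= max_len:
        return feat[:max_len]                               # truncate long utterances
    pad = np.zeros((max_len - feat.shape[0], feat.shape[1]))
    return np.vstack([feat, pad])                           # zero-pad short utterances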
I have modified only the 4 places from float to int, e.g. x = signal[int(cur_p):int(cur_p+win)] (in 4 places). Are there any other places where float needs to be changed to int?
Just modify this line of code, "nfft = win / 2" -> "nfft = int(win / 2)", in the function stFeatureSpeed.
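Putting the Python 3 fixes together, the relevant lines in stFeatureSpeed end up like this (a sketch of the casts discussed above; under Python 3, "/" returns a float, so anything used as an array size or slice index needs an explicit int()):

nfft = int(win / 2)                      # was: nfft = win / 2
x = signal[int(cur_p):int(cur_p + win)]  # slice indices must be integers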
Once I made the modification mentioned above and cast M to an integer (M = int(M)), I got another error related to numpy:
File "find_best_model.py", line 168, in
It's a Python 3 compatibility issue. I've already fixed it just now.
Thank you so much for the help you gave me these days. I greatly appreciate the assistance and the prompt responses you have provided.
I am facing this problem and need some help. This is part of the code from audioFeatureExtraction.py:
fbank = numpy.zeros((nFiltTotal, nfft))
nfreqs = numpy.arange(nfft) / (1. * nfft) * fs
The error is given as shown below:

line 271, in mfccInitFilterBanks
    fbank = numpy.zeros((nFiltTotal, nfft))
TypeError: 'float' object cannot be interpreted as an integer
=========================================================
Writing berlin data set to file...
Traceback (most recent call last):
  File "/home/lwin/speech-emotion/Speech_emotion_recognition_BLSTM-master/find_best_model.py", line 163, in
    functions.feature_extract(ds.data, nb_samples=len(ds.targets), dataset=dataset)
  File "/home/lwin/speech-emotion/Speech_emotion_recognition_BLSTM-master/utility/functions.py", line 20, in feature_extract
    hr_pitch = audioFeatureExtraction.stFeatureSpeed(x, Fs, globalvars.frame_size * Fs, globalvars.step * Fs)
  File "/usr/local/lib/python3.5/dist-packages/pyAudioAnalysis/audioFeatureExtraction.py", line 669, in stFeatureSpeed
    [fbank, freqs] = mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil)
TypeError: mfccInitFilterBanks() takes 2 positional arguments but 7 were given
=================================================================
How do I solve this issue? Please help me. Thank you.