RayanWang / Speech_emotion_recognition_BLSTM

Bidirectional LSTM network for speech emotion recognition.
MIT License

How to modify the code in audioFeatureExtraction.py to fix this error #4

Closed hirunifernando closed 6 years ago

hirunifernando commented 6 years ago

=========================================================
Writing berlin data set to file...
Traceback (most recent call last):
  File "/home/lwin/speech-emotion/Speech_emotion_recognition_BLSTM-master/find_best_model.py", line 163, in <module>
    functions.feature_extract(ds.data, nb_samples=len(ds.targets), dataset=dataset)
  File "/home/lwin/speech-emotion/Speech_emotion_recognition_BLSTM-master/utility/functions.py", line 20, in feature_extract
    hr_pitch = audioFeatureExtraction.stFeatureSpeed(x, Fs, globalvars.frame_size * Fs, globalvars.step * Fs)
  File "/usr/local/lib/python3.5/dist-packages/pyAudioAnalysis/audioFeatureExtraction.py", line 669, in stFeatureSpeed
    [fbank, freqs] = mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil)
TypeError: mfccInitFilterBanks() takes 2 positional arguments but 7 were given

=================================================================
How can I solve this issue? Please help me. Thank you.

RayanWang commented 6 years ago

Hi hirunifernando,
[fbank, freqs] = mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil)
In the call above, just remove the arguments from the 3rd onward when invoking mfccInitFilterBanks, so that only Fs and nfft are passed.
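The suggested change can be sketched as follows. This is a minimal illustration only: the mfccInitFilterBanks defined here is a hypothetical stand-in that merely accepts two positional arguments, as the traceback reports for the installed version. It is not the pyAudioAnalysis implementation, and all numeric values are placeholders.

```python
def mfccInitFilterBanks(fs, nfft):
    """Hypothetical stand-in that takes exactly two positional arguments."""
    # Placeholder return values; the real function returns filter banks.
    return "fbank", "freqs"

# Placeholder values for illustration only.
Fs, nfft = 16000, 200
lowfreq, linsc, logsc, nlinfil, nlogfil = 0.0, 1.0, 1.0, 1, 1

# Old call: seven positional arguments, which a two-argument function rejects.
try:
    mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil)
    old_call_error = None
except TypeError as exc:
    old_call_error = str(exc)

# Fixed call: everything from the 3rd argument onward is dropped.
fbank, freqs = mfccInitFilterBanks(Fs, nfft)
```

In the real library the edit would be made to the call inside stFeatureSpeed, at the line shown in the traceback.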

hirunifernando commented 6 years ago

Thank you, RayanWang. I really appreciate your quick reply. If I use audio files of different durations, do I need to change frame_size and step? Does it affect the feature vector?

RayanWang commented 6 years ago

The frame_size and step are the same as those mentioned in the Microsoft paper. They have nothing to do with the file duration. The only thing you have to pay attention to is the maximum length (max_len) in feature extraction. At present this value is hard-coded to 1024 (for padding sequences). If your audio file is very long, you may have to increase this value.
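To illustrate why max_len matters for long files, here is a minimal sketch of padding a per-file feature sequence to a fixed length. It is not the repository's code: the helper name pad_or_truncate, the choice to pad with zeros at the end, and the example shapes are all assumptions made for illustration. Only the value 1024 comes from the comment above.

```python
import numpy as np

MAX_LEN = 1024  # the hard-coded value mentioned in the thread

def pad_or_truncate(features, max_len=MAX_LEN):
    """Hypothetical helper: return a (max_len, n_features) array.

    Shorter sequences are zero-padded at the end; longer ones are cut off.
    """
    n_frames, n_features = features.shape
    padded = np.zeros((max_len, n_features), dtype=features.dtype)
    keep = min(n_frames, max_len)
    padded[:keep] = features[:keep]
    return padded

# Illustrative inputs: (frames, features) for a short and a long recording.
short_file = np.ones((188, 34))
long_file = np.ones((1500, 34))

short_padded = pad_or_truncate(short_file)  # 188 real frames, then zeros
long_cut = pad_or_truncate(long_file)       # frames beyond 1024 are dropped
```

In this sketch a longer recording produces more frames, and any frames past max_len are discarded, which is why a larger value may be needed for very long audio.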

hirunifernando commented 6 years ago

Thank you, I got the point. I have another issue to fix:
  File "C:\Program Files\Python36\lib\site-packages\pyAudioAnalysis\audioFeatureExtraction.py", line 197, in mfccInitFilterBanks
    fbank = numpy.zeros((nFiltTotal, nfft))
TypeError: 'float' object cannot be interpreted as an integer

I have changed float to int in only 4 places, for example: x = signal[int(cur_p):int(cur_p+win)]. Are there other places where float needs to be changed to int?

RayanWang commented 6 years ago

Just modify this line of code "nfft = win / 2" -> "nfft = int(win / 2)" in function stFeatureSpeed.
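The reason for this change is that in Python 3 the / operator always returns a float, and numpy.zeros does not accept a float as a dimension. A minimal sketch, with an illustrative window length and filter count that are not taken from the library:

```python
import numpy

win = 400.0        # illustrative window length in samples
n_filters = 40     # illustrative number of filters

nfft_float = win / 2       # true division: 200.0, a float
nfft_int = int(win / 2)    # the suggested fix: 200, an int

# A float dimension is rejected by numpy.zeros.
try:
    numpy.zeros((n_filters, nfft_float))
    float_shape_error = None
except TypeError as exc:
    float_shape_error = str(exc)

# With the integer value the allocation succeeds.
fbank = numpy.zeros((n_filters, nfft_int))
```

Any other value that ends up as an array dimension or a slice index needs the same treatment, which is why the int(...) casts around cur_p and win were also required.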

hirunifernando commented 6 years ago

After making the modification mentioned above and converting M to an integer (M = int(M)), I got another error, this time related to NumPy:
  File "find_best_model.py", line 168, in <module>
    extractor.extract_dataset(ds.data, nb_samples=len(ds.targets), dataset=dataset)
  File "D:\New folder\Speech_emotion_recognition_BLSTM-master\Speech_emotion_recognition_BLSTM-master\utility\audio.py", line 274, in extract_dataset
    f = np.append(f, hr_pitch.transpose(), axis=0)
  File "C:\Program Files\Python36\lib\site-packages\numpy\lib\function_base.py", line 5160, in append
    arr = asanyarray(arr)
  File "C:\Program Files\Python36\lib\site-packages\numpy\core\numeric.py", line 544, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
ValueError: could not broadcast input array from shape (34,188) into shape (34)
Please help me solve this. Thank you for your great support.

RayanWang commented 6 years ago

It's a Python 3 compatibility issue. I have just fixed it.

hirunifernando commented 6 years ago

Thank you so much for the help you gave me these days. I greatly appreciate the assistance and the earliest responses you have provided me.

yao-1115 commented 2 years ago

I am facing this problem and need some help. This is part of the code from audioFeatureExtraction.py:

    # Compute filterbank coeff (in fft domain, in bins)
    fbank = numpy.zeros((nFiltTotal, nfft))
    nfreqs = numpy.arange(nfft) / (1. * nfft) * fs

The error is shown below:
  line 271, in mfccInitFilterBanks
    fbank = numpy.zeros((nFiltTotal, nfft))
TypeError: 'float' object cannot be interpreted as an integer