tyiannak / pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Apache License 2.0
5.88k stars 1.2k forks

silenceremoval ValueError: shape mismatch when nChroma.max() > nChroma.shape[0] #232

Open giusarno opened 5 years ago

giusarno commented 5 years ago

I get an error for an 8 kHz WAV file when I run this simple example. It works for 16 kHz recordings.

```python
from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import audioSegmentation as aS

[Fs, x] = aIO.readAudioFile("recs/Wallet1.wav")

print(Fs)
print(x)

segments = aS.silenceRemoval(x, Fs, 0.020, 0.020, smoothWindow=0.6, weight=0.3, plot=True)
segments = aS.silenceRemoval(x, Fs, 0.020, 0.020)
```


ValueError                                Traceback (most recent call last)

in ()
      4 #print (Fs)
      5 print (x)
----> 6 segments = aS.silenceRemoval(x, Fs, 0.020, 0.020, smoothWindow = 0.6, weight = 0.3, plot = True)
      7 #segments = aS.silenceRemoval(x, Fs, 0.020, 0.020)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioSegmentation.py in silenceRemoval(x, fs, st_win, st_step, smoothWindow, weight, plot)
    646     x = audioBasicIO.stereo2mono(x)
    647     st_feats, _ = aF.stFeatureExtraction(x, fs, st_win * fs,
--> 648                                          st_step * fs)
    649
    650     # Step 2: train binary svm classifier of low vs high energy frames

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stFeatureExtraction(signal, fs, win, step)
    590         curFV[n_time_spectral_feats:n_time_spectral_feats+n_mfcc_feats, 0] = \
    591             stMFCC(X, fbank, n_mfcc_feats).copy()    # MFCCs
--> 592         chromaNames, chromaF = stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
    593         curFV[n_time_spectral_feats + n_mfcc_feats:
    594               n_time_spectral_feats + n_mfcc_feats + n_chroma_feats - 1] = \

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
    269         I = numpy.nonzero(nChroma>nChroma.shape[0])[0][0]
    270         C = numpy.zeros((nChroma.shape[0],))
--> 271         C[nChroma[0:I-1]] = spec
    272         C /= nFreqsPerChroma
    273         finalC = numpy.zeros((12, 1))

ValueError: shape mismatch: value array of shape (80,) could not be broadcast to indexing result of shape (56,)
giusarno commented 5 years ago

This does not seem to be directly related to the sampling frequency, but to the `if nChroma.max() < nChroma.shape[0]:` check in `stChromaFeatures`. When that check fails, the code falls through to the branch shown in the traceback.
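To make the failing branch easier to inspect, here is a minimal NumPy-only sketch that tries to reproduce the mismatch without pyAudioAnalysis installed. The bin-to-chroma mapping in `chroma_bins` (bin frequencies `(f + 1) * fs / (2 * nfft)` and a 27.5 Hz reference) is an assumption about how the library builds `nChroma`; it is not shown in the traceback. The helper names `chroma_bins` and `try_chroma_assignment` are hypothetical. Only the `if`/`else` logic mirrors the lines quoted above.

```python
import numpy as np

def chroma_bins(fs, win_sec):
    """Map each FFT bin to a chroma index (ASSUMED formula, see note above)."""
    win = int(round(win_sec * fs))      # window length in samples
    nfft = win // 2                     # number of FFT bins kept
    freqs = np.array([(f + 1) * fs / (2 * nfft) for f in range(nfft)])
    return np.round(12.0 * np.log2(freqs / 27.5)).astype(int)

def try_chroma_assignment(fs, win_sec):
    """Run the indexing step from the traceback; return 'ok' or the error text."""
    n_chroma = chroma_bins(fs, win_sec)
    nfft = n_chroma.shape[0]
    spec = np.ones(nfft)                # stand-in for the squared spectrum
    C = np.zeros(nfft)
    if n_chroma.max() < nfft:
        C[n_chroma] = spec              # every index fits inside C
        return "ok"
    # Branch quoted in the traceback: only the first I-1 indices are used,
    # but spec still has nfft values, so the shapes cannot match.
    I = np.nonzero(n_chroma > nfft)[0][0]
    try:
        C[n_chroma[0:I - 1]] = spec
    except ValueError as err:
        return str(err)
    return "ok"

if __name__ == "__main__":
    print(try_chroma_assignment(8000, 0.020))
    print(try_chroma_assignment(16000, 0.020))
```

Under these assumptions, the 8 kHz call gives 80 FFT bins with a largest chroma index above 80, so the second branch runs and the slice is shorter than `spec`; the 16 kHz call has 160 bins and takes the first branch.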

IvanEvan commented 5 years ago

I had the same problem. I'm stuck here...

nofi-sys commented 4 years ago

I had the same issue. In my case, I was using silenceRemoval() from audioSegmentation.py.

I solved it by changing st_win and st_step (window size and step in seconds) from 0.020 to 0.040.

This is the line:

`self.segments = aS.silenceRemoval(self.audio_x, self.Fs, 0.040, 0.040, weight=0.9, smoothWindow=0.9, plot=False)`
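A possible explanation for why the longer window helps, as a rough sketch: the check in `stChromaFeatures` compares the largest chroma index with the number of FFT bins, and the number of bins grows with the window length. The code below assumes the top FFT bin sits at `fs / 2` and that chroma indices are computed as `round(12 * log2(f / 27.5))`; neither detail is confirmed in this thread, and all function names are hypothetical.

```python
import math

CP = 27.5  # ASSUMED reference frequency of the chroma mapping

def max_chroma_index(fs):
    """Largest chroma index, assuming the top FFT bin is at fs / 2."""
    return int(round(12.0 * math.log2((fs / 2) / CP)))

def min_window_samples(fs):
    """Smallest window (in samples) whose bin count exceeds the largest index."""
    return 2 * (max_chroma_index(fs) + 1)

def window_is_safe(fs, win_sec):
    """True if nChroma.max() < nChroma.shape[0] would hold for this window."""
    nfft = int(round(win_sec * fs)) // 2
    return max_chroma_index(fs) < nfft

if __name__ == "__main__":
    for fs, win in [(8000, 0.020), (8000, 0.040), (16000, 0.020)]:
        print(fs, win, window_is_safe(fs, win))
    print("min window at 8 kHz (samples):", min_window_samples(8000))
```

If the assumptions hold, any window slightly longer than 20 ms should also avoid the failing branch at 8 kHz, so 0.040 is one workable choice rather than the only one.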

Hope this is useful.