tyiannak / pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Apache License 2.0

Error in training HMM model #62

Open SimoMoilanen opened 7 years ago

SimoMoilanen commented 7 years ago

Hello, I'm trying to train an HMM model from a 15-minute WAV file. It crashes with the following traceback:

Traceback (most recent call last):
  File "trainer.py", line 20, in <module>
    aS.trainHMM_fromFile('training/hmm/chunk.wav', 'training/hmm/chunk.segments', 'trained/hmm_binary', 2.0, 2.0) # train using a single file
  File "/Users/simo/koodaus/tandem_editor/pyAudioAnalysis/audioSegmentation.py", line 337, in trainHMM_fromFile
    [F, _] = aF.mtFeatureExtraction(x, Fs, mtWin * Fs, mtStep * Fs, round(Fs * 0.050), round(Fs * 0.050))    # feature extraction
  File "/Users/simo/koodaus/tandem_editor/pyAudioAnalysis/audioFeatureExtraction.py", line 610, in mtFeatureExtraction
    stFeatures = stFeatureExtraction(signal, Fs, stWin, stStep)
  File "/Users/simo/koodaus/tandem_editor/pyAudioAnalysis/audioFeatureExtraction.py", line 572, in stFeatureExtraction
    curFV[2] = stEnergyEntropy(x)                    # short-term entropy of energy
  File "/Users/simo/koodaus/tandem_editor/pyAudioAnalysis/audioFeatureExtraction.py", line 48, in stEnergyEntropy
    subWindows = frame.reshape(subWinLength, numOfShortBlocks, order='F').copy()
ValueError: cannot reshape array of size 4800 into shape (240,10)

chunk.segments:

0.00,391.00,speech
391.00,921.78,other

Environment: OS X El Capitan 10.11.4, Python 2.7.13, numpy==1.12.0

I can train and segment with other models (svm, knn, extratrees, gradientboosting and randomforest) successfully.

spnichol commented 7 years ago

I'm also having an issue training an HMM model. Getting a similar error here:


  File "/usr/local/lib/python2.7/dist-packages/pyAudioAnalysis/audioFeatureExtraction.py", line 48, in stEnergyEntropy
    subWindows = frame.reshape(subWinLength, numOfShortBlocks, order='F').copy()
ValueError: total size of new array must be unchanged

spnichol commented 7 years ago

@SimoMoilanen I tried converting the WAV files to a sample rate of 16000 Hz (what the package author uses in his examples) and now it works without any problems.
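
Before converting anything, it can help to see what sample rate and channel count a file actually has. The following is a minimal sketch using only Python 3's standard `wave` module. The helper name `describe_wav` and the demo file are hypothetical and not part of pyAudioAnalysis; the demo values (48000 Hz, 2 channels) are made up for illustration.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (sample_rate_hz, n_channels) read from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate(), wav_file.getnchannels()


# Demo on a throwaway file: 0.1 s of 16-bit stereo silence at 48 kHz.
# These values are illustrative only; point describe_wav at your own file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(2)       # stereo
    wav_file.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
    wav_file.setframerate(48000)   # sample rate in Hz
    wav_file.writeframes(b"\x00" * (4800 * 2 * 2))  # frames * channels * bytes

rate, channels = describe_wav(demo_path)
print(rate, channels)
```

If the reported values differ from what the examples use (16000 Hz here, and mono according to the later comments), converting the file first is worth trying.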

rishavag1995 commented 7 years ago

I had the same problem. Then I discovered that the audio I was using was stereo, and a mono audio file is required.
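
The numbers in the first traceback are consistent with this explanation: 4800 is exactly 2 x 2400, and (240, 10) holds only 2400 values. The sketch below reproduces the failure with NumPy under two assumptions that are not confirmed by the thread: that the frame reaching `stEnergyEntropy` has shape (2400, 2), and that the sub-window length is derived from `len(frame)`, which counts rows only. Averaging the channels is one possible downmix, not necessarily what the library expects.

```python
import numpy as np

num_short_blocks = 10          # block count from the traceback's target shape (240, 10)
frame = np.zeros((2400, 2))    # assumed: one 2400-sample window of a 2-channel signal

# len() on a 2-D array counts rows only, so the second channel is ignored here.
sub_win_length = len(frame) // num_short_blocks   # 240

try:
    # 4800 values cannot fit into 240 * 10 = 2400 slots.
    frame.reshape(sub_win_length, num_short_blocks, order="F")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Downmixing to one channel leaves 2400 values, and the same reshape succeeds.
mono = frame.mean(axis=1)
sub_windows = mono.reshape(sub_win_length, num_short_blocks, order="F")
print(sub_windows.shape)
```

Under these assumptions, any multi-channel input would trigger the error regardless of the file's length.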

matteocollina commented 5 years ago

@rishavag1995 is right. I recommend using pydub to convert the file to a mono WAV:

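from pydub import AudioSegment

# Placeholder paths; replace with your own input and output files.
input_filename = "/path/to/file.wav"
output_filename = "/path/to/file_mono.wav"

sound = AudioSegment.from_file(input_filename, format="wav")
sound = sound.set_channels(1)  # downmix to a single (mono) channel
sound.export(output_filename, format="wav")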