tyiannak / pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Apache License 2.0

Signal from read_audio_generic causes ValueError in feature_extraction #376

Open FrancescoManfredi opened 1 year ago

What is the problem

The feature_extraction function requires the signal to have shape (m,) and fails when given a signal of shape (m, 1), such as the one returned by read_audio_generic.
To reproduce, run the following code:

from pyAudioAnalysis import ShortTermFeatures as aF
from pyAudioAnalysis import audioBasicIO as aIO

# signal loaded with read_audio_file
Fs, s = aIO.read_audio_file("data/mio_audio.wav")
print(s.shape)
# signal loaded with read_audio_generic
Fs2, s2 = aIO.read_audio_generic("data/audio_test.mp3")
print(s2.shape)

# extracting features directly from the first signal is ok
_, _ = aF.feature_extraction(s, Fs, 500, 500, deltas=False)

# extracting features from the second signal as returned fails with
# ValueError: shapes (250,1) and (250,40) not aligned: 1 (dim 1) != 250 (dim 0)
try:
    _, _ = aF.feature_extraction(s2, Fs2, 500, 500, deltas=False)
except ValueError as e:
    print("ValueError:", e)

# reshaping to (m,) fixes the issue
s2 = s2.reshape((s2.shape[0], ))
_, _ = aF.feature_extraction(s2, Fs2, 500, 500, deltas=False)
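The wording of the error suggests it comes from a NumPy matrix product somewhere inside feature extraction. The snippet below is only a sketch of that mechanism using plain NumPy. The arrays are stand-ins chosen to match the shapes in the message (a 250-sample window and a hypothetical 250 x 40 matrix), not the library's actual internals.

```python
import numpy as np

# Stand-in arrays: the shapes mirror the error message, the values are arbitrary.
window_1d = np.ones(250)                 # shape (250,), like a window of an (m,) signal
window_2d = window_1d.reshape(-1, 1)     # shape (250, 1), like a window of an (m, 1) signal
matrix = np.ones((250, 40))              # hypothetical (250, 40) matrix

# A 1-D window multiplies cleanly: (250,) . (250, 40) -> (40,)
ok = np.dot(window_1d, matrix)
print(ok.shape)

# A column-vector window does not: the inner dimensions 1 and 250 differ
try:
    np.dot(window_2d, matrix)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

With a 1-D window the product is defined, while the (250, 1) column vector has an inner dimension of 1, which is why the shapes are reported as not aligned.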

Why this matters

This is not a critical flaw, but the inconsistency is annoying and can be hard to spot when working with read_audio_generic.

Proposed fix

Add a check at the beginning of feature_extraction to detect signals in the form (m, 1) and reshape them to (m, ).
OR
Return a signal with the same shape as read_audio_file from read_audio_generic.
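As a rough sketch of the first option, a shape check could look like the helper below. The name as_1d_signal is hypothetical and not part of pyAudioAnalysis. It assumes that only (m,) and (m, 1) inputs should be accepted and that anything else (for example a multi-channel signal) should be rejected rather than silently flattened.

```python
import numpy as np


def as_1d_signal(signal):
    """Hypothetical helper: return the signal with shape (m,).

    Accepts (m,) unchanged and squeezes (m, 1) to (m,).
    Any other shape raises ValueError.
    """
    signal = np.asarray(signal)
    if signal.ndim == 1:
        return signal
    if signal.ndim == 2 and signal.shape[1] == 1:
        # Single-channel column vector: take the only column
        return signal[:, 0]
    raise ValueError(
        f"expected a signal of shape (m,) or (m, 1), got {signal.shape}"
    )


# Usage with stand-in data instead of real audio
column_signal = np.zeros((1000, 1))
flat_signal = as_1d_signal(column_signal)
print(flat_signal.shape)
```

The same normalization could instead be applied to the return value of read_audio_generic, which would correspond to the second option.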