librosa / librosa

Python library for audio and music analysis
https://librosa.org/
ISC License

filters.mel: ValueError: operands could not be broadcast together with shapes (1,1025) (0,) #826

Closed: albertz closed this issue 5 years ago

albertz commented 5 years ago

Description

I get an exception in filters.mel: ValueError: operands could not be broadcast together with shapes (1,201) (0,). It is raised at the line lower = -ramps[i] / fdiff[i], where fdiff = array([], shape=(130, 0), dtype=float64).

The same problem was also reported here:

Steps/Code to Reproduce

Example:

      import numpy
      import librosa

      # Audio samples (elided in the report); len(audio) == 22912
      audio = numpy.array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                           -4.98423541e-05, -1.79219493e-04,  1.82329724e-04])
      sample_rate = 16000
      num_feature_filters =  40
      step_len = 0.01     # hop size in seconds
      window_len =  0.025 # window size in seconds

      mfccs = librosa.feature.mfcc(
            audio, sr=sample_rate,
            n_mfcc=num_feature_filters,
            hop_length=int(step_len * sample_rate), n_fft=int(window_len * sample_rate))

Expected Results

It works. It has also worked in the past.

Actual Results

Exception:

  File "/u/zeyer/setups/librispeech/2018-02-26--att/returnn/GeneratingDataset.py", line 814, in _get_audio_features_mfcc
    line: mfccs = librosa.feature.mfcc(
            audio, sr=sample_rate,
            n_mfcc=num_feature_filters,
            hop_length=int(step_len * sample_rate), n_fft=int(window_len * sample_rate))
    locals:
      mfccs = <not found>
      librosa = <local> <module 'librosa' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/__init__.py'>
      librosa.feature = <local> <module 'librosa.feature' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/__init__.py'>
      librosa.feature.mfcc = <local> <function mfcc at 0x7f98700e9488>
      audio = <local> array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                             -4.98423541e-05, -1.79219493e-04,  1.82329724e-04]), len = 22912
      sr = <not found>
      sample_rate = <local> 16000
      n_mfcc = <not found>
      num_feature_filters = <local> 40
      hop_length = <not found>
      int = <builtin> <class 'int'>
      step_len = <local> 0.01
      n_fft = <not found>
      window_len = <local> 0.025
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/spectral.py", line 1299, in mfcc
    line: S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
    locals:
      S = <local> None
      power_to_db = <global> <function power_to_db at 0x7f98700d1048>
      melspectrogram = <global> <function melspectrogram at 0x7f98700e9510>
      y = <local> array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                         -4.98423541e-05, -1.79219493e-04,  1.82329724e-04]), len = 22912
      sr = <local> 16000
      kwargs = <local> {'hop_length': 160, 'n_fft': 400}
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/spectral.py", line 1391, in melspectrogram
    line: mel_basis = filters.mel(sr, n_fft, **kwargs)
    locals:
      mel_basis = <not found>
      filters = <global> <module 'librosa.filters' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/filters.py'>
      filters.mel = <global> <function mel at 0x7f98701338c8>
      sr = <local> 16000
      n_fft = <local> 400
      kwargs = <local> {}
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/filters.py", line 247, in mel
    line: lower = -ramps[i] / fdiff[i]
    locals:
      lower = <not found>
      ramps = <local> array([[[ 0.00000000e+00, -4.00000000e+01, -8.00000000e+01, ...,
...                             [[ 4.67655199e+01,  6.76551987e+00..., len = 130, _[0]: {len = 1, _[0]: {len = 201}}
      i = <local> 0
      fdiff = <local> array([], shape=(130, 0), dtype=float64), len = 130, _[0]: {len = 0}
ValueError: operands could not be broadcast together with shapes (1,201) (0,)

Versions

In [6]: librosa.version.show_versions()
INSTALLED VERSIONS
------------------
python: 3.6.3 (default, Oct 25 2017, 11:03:15) 
[GCC 5.4.0 20160609]

librosa: 0.5.1

audioread: 2.1.5
numpy: 1.16.0
scipy: 1.1.0
scikit-learn: None
joblib: 0.11
decorator: 4.3.0
six: 1.11.0
resampy: 0.2.0

numpydoc: None
sphinx: None
sphinx_rtd_theme: None
sphinxcontrib-versioning: None
matplotlib: 2.1.0
numba: 0.35.0

>>> import platform; print(platform.platform())
Linux-4.4.0-53-generic-x86_64-with-debian-stretch-sid
>>> import sys; print("Python", sys.version)
Python 3.6.3 (default, Oct 25 2017, 11:03:15) 
[GCC 5.4.0 20160609]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.16.0
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.1.0
>>> import librosa; print("librosa", librosa.__version__)
librosa 0.5.1

albertz commented 5 years ago

Ok, after updating to librosa 0.6.2, this issue seems to be gone. So maybe this can be closed. But I guess it's useful to have this as a reference for others who stumble upon this exception.

lostanlen commented 5 years ago

Hello. Glad to find out that you sorted out the issue by updating. For completeness, what was the shape of the audio array?

albertz commented 5 years ago

audio.shape == (22912,). You can see that also in the stacktrace output.

bmcfee commented 5 years ago

I'm glad that this is now fixed, but I'm a bit confused about where it came from. As far as I can tell, nothing has changed in the mel filter construction code between 0.5.1 and now (0.6.2/3). Did anything else change in the dependencies when you upgraded to 0.6.2?

albertz commented 5 years ago

I also wondered about that. I briefly went through the librosa code and could not work out how this could happen. I guessed it might be related to the caching mechanism or to multi-threading, but instead of debugging further I just tried the latest version (I knew it had worked in the past, so I must somehow have ended up with an older version installed). As far as I remember, no other package was updated when I upgraded librosa.

bmcfee commented 5 years ago

I dug into this a bit, and I think the only thing that could have caused this was a change we made in hz_to_mel to support arbitrary array shape inputs (including scalars) #628. The old (0.5.1) implementation always cast to 1d, while post-#628 code preserves the shape of the input.
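The shape difference described above can be illustrated with a small sketch. The two functions below are hypothetical stand-ins, not librosa's actual code: one promotes every input to at least 1-D, the other keeps the input shape. The HTK-style mel formula is used only to have something to compute.

```python
import numpy as np


def to_mel_always_1d(freq_hz):
    """Hypothetical old-style variant: promotes every input to at least 1-D."""
    freq_hz = np.atleast_1d(freq_hz).astype(float)
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def to_mel_shape_preserving(freq_hz):
    """Hypothetical newer-style variant: keeps the shape of the input."""
    freq_hz = np.asanyarray(freq_hz, dtype=float)
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


# A scalar input comes back as shape (1,) from the first variant
# and as a 0-d value (shape ()) from the second.
print(np.shape(to_mel_always_1d(8000.0)))
print(np.shape(to_mel_shape_preserving(8000.0)))
```

For inputs that are already arrays, both variants return the same shape; the difference only shows up for scalars such as fmin and fmax.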

I don't quite see how exactly this would cause the problem we're seeing here, unless there's some gross incompatibility with the frequency ranges in the filter bank construction, but seeing as the current behavior is correct, I feel okay closing this out.

Firedope commented 5 years ago

I am also having the same kind of issue. Can anyone help me, please?

albertz commented 5 years ago

> I am also having the same kind of issue. Can anyone help me, please?

Just update to librosa 0.6.2.

Firedope commented 5 years ago

Ok thanks 😊

boeddeker commented 5 years ago

For documentation: NumPy 1.16 changed linspace so that it takes the shape of its start and stop inputs into account. This triggers the bug with the old hz_to_mel, because that function always returned an array with at least one dimension.

NumPy 1.16 release notes: https://docs.scipy.org/doc/numpy/release.html

Start and stop arrays for linspace, logspace and geomspace

These functions used to be limited to scalar stop and start values, but can now take arrays, which will be properly broadcast and result in an output which has one axis prepended. This can be used, e.g., to obtain linearly interpolated points between sets of points.
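A minimal sketch of how this plays out, assuming NumPy >= 1.16. The helper hz_to_mel_1d is a hypothetical stand-in for the old converter (it only mimics the "always at least 1-D" behaviour), and the array sizes are taken from the traceback (130 points, 201 FFT bins). Only the shapes matter here, not the values.

```python
import numpy as np


def hz_to_mel_1d(freq_hz):
    """Hypothetical stand-in for the old converter: always at least 1-D."""
    freq_hz = np.atleast_1d(freq_hz).astype(float)
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)  # HTK-style, illustration only


n_points = 130                              # len = 130 in the traceback
fft_freqs = np.linspace(0.0, 8000.0, 201)   # 201 bins, as for n_fft=400

min_mel = hz_to_mel_1d(0.0)      # shape (1,)
max_mel = hz_to_mel_1d(8000.0)   # shape (1,)

# With array endpoints, NumPy >= 1.16 prepends an axis: shape (130, 1).
mels = np.linspace(min_mel, max_mel, n_points)

# diff runs along the last axis, which has length 1, so nothing is left.
fdiff = np.diff(mels)                        # shape (130, 0)
ramps = np.subtract.outer(mels, fft_freqs)   # shape (130, 1, 201)

try:
    lower = -ramps[0] / fdiff[0]             # (1, 201) against (0,)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(mels.shape, fdiff.shape, ramps.shape)
print(error_message)

# Passing plain Python scalars as endpoints avoids the extra axis.
mels_scalar = np.linspace(min_mel.item(), max_mel.item(), n_points)
print(mels_scalar.shape, np.diff(mels_scalar).shape)
```

With scalar endpoints the result is one-dimensional again, which is consistent with the newer, shape-preserving hz_to_mel not running into this error.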

bmcfee commented 5 years ago

Thanks @boeddeker for tracking that down! I would never have figured that out.

Seeing as it's far too late to retroactively add a version pin on old librosa packages, I think there's nothing to be done here.

mrgloom commented 5 years ago

Same issue here. Which NumPy version is compatible with librosa 0.5.1?

>>> import librosa
>>> import numpy as np
>>> librosa.filters.mel(22050, 2048, 80, 0, 8000.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/Anaconda/lib/python3.6/site-packages/librosa/filters.py", line 247, in mel
    lower = -ramps[i] / fdiff[i]
ValueError: operands could not be broadcast together with shapes (1,1025) (0,)
>>> librosa.__version__
'0.5.1'
>>> np.__version__
'1.16.1'

mrgloom commented 5 years ago

At least I was able to run it with numpy==1.14.3 https://github.com/mozilla/TTS/blob/master/requirements.txt

bmcfee commented 5 years ago

@mrgloom I'm glad that you got it working, but your librosa version is several years out of date. I'd strongly recommend upgrading.
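For anyone who cannot upgrade right away, a startup check along these lines might help catch the combination early. This is only a sketch: the helper names are hypothetical, and the cutoffs (librosa 0.5.x or older together with NumPy 1.16 or newer) are assumptions based solely on the versions reported in this thread. The thread does not say whether librosa 0.6.0 or 0.6.1 are affected.

```python
import re


def version_tuple(version, width=2):
    """Return the first `width` numeric components, e.g. '1.16.1' -> (1, 16)."""
    numbers = []
    for piece in version.split(".")[:width]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def matches_reported_failure(librosa_version, numpy_version):
    """True for the combination reported as failing in this thread.

    Assumed cutoffs: librosa <= 0.5.x together with NumPy >= 1.16.
    """
    old_librosa = version_tuple(librosa_version) <= (0, 5)
    new_numpy = version_tuple(numpy_version) >= (1, 16)
    return old_librosa and new_numpy


print(matches_reported_failure("0.5.1", "1.16.1"))  # reported failing
print(matches_reported_failure("0.6.2", "1.16.0"))  # reported working
print(matches_reported_failure("0.5.1", "1.14.3"))  # reported working
```

In a real project the two arguments would be librosa.__version__ and numpy.__version__, and a match could raise an error suggesting either a librosa upgrade or an older NumPy.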

vishal2106 commented 4 years ago

> At least I was able to run it with numpy==1.14.3 https://github.com/mozilla/TTS/blob/master/requirements.txt

Thank you so much. You saved our project.