BirdVox / birdvoxdetect

A pre-trained deep learning system for detecting bird flight calls in continuous recordings
MIT License

librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(5221248, 2) #44

Closed ses4j closed 4 years ago

ses4j commented 4 years ago

I think the error just means I need to convert stereo to mono. That seems to resolve it.


```
C:\wc\record-nfc\.venv\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location.
Import requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
C:\wc\record-nfc\.venv\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location.
Import of 'jit' requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
birdvoxdetect: Threshold = 40.0
birdvoxdetect: Duration of exported clips = 2.00 seconds.
birdvoxdetect: Processing: D:\birdrecordings\2020-05-07-NFC-recordings\200507-224636.WAV
Traceback (most recent call last):
  File "C:\Program Files\Python37\Lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Program Files\Python37\Lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\wc\record-nfc\.venv\lib\site-packages\birdvoxdetect\__main__.py", line 7, in <module>
    main()
  File "C:\wc\record-nfc\.venv\lib\site-packages\birdvoxdetect\cli.py", line 200, in main
    logger_level=logger_level)
  File "C:\wc\record-nfc\.venv\lib\site-packages\birdvoxdetect\cli.py", line 89, in run
    logger_level=logger_level)
  File "C:\wc\record-nfc\.venv\lib\site-packages\birdvoxdetect\core.py", line 529, in process_file
    chunk_pcen = compute_pcen(chunk_audio, sr)
  File "C:\wc\record-nfc\.venv\lib\site-packages\birdvoxdetect\core.py", line 746, in compute_pcen
    librosa.util.valid_audio(audio, mono=True)
  File "C:\wc\record-nfc\.venv\lib\site-packages\librosa\util\utils.py", line 164, in valid_audio
    'ndim={:d}, shape={}'.format(y.ndim, y.shape))
librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(5221248, 2)
```

lostanlen commented 4 years ago

Thank you for reporting. BirdVoxDetect does not support multichannel input. Please convert the recording to mono before calling birdvoxdetect.process_file.
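
For anyone landing here with the same error, here is a minimal sketch of the "convert to mono first" workaround using only the Python standard library. It assumes the recording is an uncompressed 16-bit PCM WAV file and reads the whole file into memory; `downmix_wav_to_mono` is a hypothetical helper name, not part of BirdVoxDetect.

```python
import array
import sys
import wave


def downmix_wav_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file and write a mono copy.

    Hypothetical helper for illustration; not part of birdvoxdetect.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        if sample_width != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = src.readframes(src.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved per frame: ch0, ch1, ch0, ch1, ...
    # Average each frame's channels into a single sample.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

The mono copy written to `dst_path` can then be passed to birdvoxdetect.process_file in place of the original stereo recording.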

justinsalamon commented 4 years ago

@lostanlen could we not downmix the audio in BVD ourselves, to save the user having to preprocess it manually?

IINM we'd have to update 3 lines in core.py, basically where we call

chunk_audio = sound_file.read(...)

We update to:

chunk_audio = sound_file.read(...)
chunk_audio = chunk_audio.mean(axis=1)

Thoughts?
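
One caveat with the two-line change above: `chunk_audio.mean(axis=1)` raises an error when the chunk is already one-dimensional (a mono file). A minimal sketch of a guarded version follows, assuming the chunk is a NumPy array shaped `(frames,)` or `(frames, channels)`, as the shape in the traceback suggests; `to_mono` is a hypothetical helper name, not an existing function in core.py.

```python
import numpy as np


def to_mono(chunk_audio):
    """Return a 1-D signal from a (frames,) or (frames, channels) array.

    Hypothetical helper for illustration; not part of birdvoxdetect.
    """
    chunk_audio = np.asarray(chunk_audio)
    if chunk_audio.ndim == 1:
        # Already mono: return unchanged.
        return chunk_audio
    if chunk_audio.ndim == 2:
        # Average across the channel axis; the result is floating point
        # even if the input samples were integers.
        return chunk_audio.mean(axis=1)
    raise ValueError(
        "expected 1-D or 2-D audio, got ndim={}".format(chunk_audio.ndim)
    )
```

With a helper like this, the line after the read would become `chunk_audio = to_mono(chunk_audio)`, and mono files would keep working as before.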