pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License
2.48k stars 641 forks

torchaudio.load() for file-like object fails for mp3 files #2363

Closed rbracco closed 2 years ago

rbracco commented 2 years ago

🐛 Describe the bug

Description

This error occurs when trying to read a file-like object that contains an MP3 audio. This error does not occur for file-like objects that contain WAV audio.

Stack Trace

formats: can't determine type of file `'
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/rob/code/english_toolkit/.venv/lib/python3.8/site-packages/torchaudio/backend/sox_io_backend.py", line 149, in load
    return torchaudio._torchaudio.load_audio_fileobj(
RuntimeError: Error loading audio file: failed to open file <in memory buffer>

Reproducible Snippets

To reproduce with MP3:

import requests
import torchaudio
with requests.get("https://filesamples.com/samples/audio/mp3/sample3.mp3", stream=True) as response:
    y, sr = torchaudio.load(response.raw)

To verify this is not an issue for WAV:

with requests.get("https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav", stream=True) as response:
    y, sr = torchaudio.load(response.raw)

Relevant Documentation

These snippets are copied exactly from the torchaudio documentation on loading from a file-like object.

Versions

torchaudio version: 0.11.0

mthrok commented 2 years ago

Hi @rbracco

To load MP3 from a file-like object, you need to pass the format="mp3" argument.
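A minimal sketch of this advice, assuming the audio bytes are already in memory. An in-memory buffer has no file name or extension to inspect, so the format hint has to come from the caller. The helpers `guess_audio_format` and `load_from_bytes` are hypothetical names for illustration, not part of torchaudio, and the header checks are a heuristic that covers only a few common signatures.

```python
import io
from typing import Optional


def guess_audio_format(header: bytes) -> Optional[str]:
    """Guess a format hint from the first bytes of an audio stream (heuristic)."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:3] == b"ID3":
        # MP3 files commonly start with an ID3v2 tag.
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        # 11 sync bits mark an MPEG audio frame; layer bits 0b01 mean Layer III.
        if ((header[1] >> 1) & 0x3) == 0b01:
            return "mp3"
    return None


def load_from_bytes(data: bytes):
    """Load audio from raw bytes, passing an explicit format hint to torchaudio."""
    import torchaudio  # imported lazily so the sniffing helper works without it

    fmt = guess_audio_format(data[:12])
    # With format=None the backend has to detect the type on its own,
    # which is the case that failed for MP3 in this issue.
    return torchaudio.load(io.BytesIO(data), format=fmt)
```

For the snippet in the report, this would amount to reading `response.content` and calling `load_from_bytes(response.content)`, or simply passing `format="mp3"` when the type is already known.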

rbracco commented 2 years ago

Thank you!

nicemanis commented 2 years ago

I'm having trouble loading mp3 even when specifying the format:

Python 3.8.13 (default, Mar 28 2022, 11:38:47)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchaudio
>>> torchaudio.get_audio_backend()
'sox_io'
>>> wav, sr = torchaudio.load("test.mp3", format="mp3")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/wav2vec/lib/python3.8/site-packages/torchaudio/backend/sox_io_backend.py", line 227, in load
    return _fallback_load(filepath, frame_offset, num_frames, normalize, channels_first, format)
  File "/path/wav2vec/lib/python3.8/site-packages/torchaudio/backend/sox_io_backend.py", line 29, in _fail_load
    raise RuntimeError("Failed to load audio from {}".format(filepath))
RuntimeError: Failed to load audio from test.mp3
>>>
rbracco commented 2 years ago

My first question would be, where did "test.mp3" come from, how certain are you it's a properly formatted MP3, and can you share it for testing?


nicemanis commented 2 years ago

It's just a random mp3 file, I tested various files to verify that it's not the issue with the file.

mthrok commented 2 years ago

Starting with 0.12, the MP3 decoder was switched to libavcodec. The error indicates that FFmpeg 4 is not installed. Please install FFmpeg 4; conda install 'ffmpeg<5' should work in most cases.
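One rough way to check whether the FFmpeg shared libraries are visible on a system is to ask Python's standard `ctypes.util.find_library`. This is only a sketch: the lookup it performs is not necessarily the one torchaudio's extension uses (for example, libraries inside a conda environment may be found by torchaudio but not by this check), and the list of library names is an assumption based on the libav* components mentioned in this thread.

```python
import ctypes.util
from typing import Dict, Optional

# Library names without the "lib" prefix (libavutil, libavcodec, ...).
FFMPEG_LIBS = ("avutil", "avcodec", "avformat", "avdevice", "avfilter")


def find_ffmpeg_shared_libs() -> Dict[str, Optional[str]]:
    """Map each FFmpeg library name to what the system lookup finds, or None."""
    return {name: ctypes.util.find_library(name) for name in FFMPEG_LIBS}


for name, found in find_ffmpeg_shared_libs().items():
    print(f"lib{name}: {found or 'not found'}")
```

If every entry prints "not found", only the ffmpeg command (or nothing at all) is likely installed, which matches the failure described above.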

nicemanis commented 2 years ago

Thank you, this solved the issue

lukeellison commented 2 years ago

I can't seem to get it to load any mp3 files with ffmpeg version 4.4.2. I'm not using conda but maybe I have to install it to get torchaudio v0.13.0.dev20220719 working with mp3s? I have a feeling mp3s are just not supported because the output of torchaudio.utils.sox_utils.list_read_formats() does not include 'mp3' if that's relevant:

Actual output:

```
['aifc', 'aiffc', 'aiff', 'aif', 'al', 'au', 'snd', 'avr', 'cdda', 'cdr', 'cvsd', 'cvs', 'cvu', 'dat', 'dvms', 'vms', 'f4', 'f32', 'f8', 'f64', 'gsrt', 'hcom', 'htk', 'ima', 'la', 'lu', 'maud', 'null', 'prc', 'raw', 's1', 's8', 'sb', 's2', 's16', 'sw', 's3', 's24', 's4', 's32', 'sl', 'sf', 'ircam', 'sln', 'smp', 'sndr', 'sndt', 'sox', 'sph', 'nist', '8svx', 'txw', 'u1', 'u8', 'ub', 'sou', 'fssd', 'u2', 'u16', 'uw', 'u3', 'u24', 'u4', 'u32', 'ul', 'voc', 'vox', 'wav', 'wavpcm', 'amb', 'wve', 'xa', 'amr-nb', 'anb', 'amr-wb', 'awb', 'flac', 'gsm', 'lpc10', 'lpc', 'opus', 'vorbis', 'ogg']
```

mthrok commented 2 years ago

FFmpeg 4.4 has a different ABI, so the C runtime does not pick it up. Can you try 4.1 - 4.3?

lukeellison commented 2 years ago

Still no joy. Here's the more specific output below. This is on a mac and in a notebook if that helps.

import subprocess
subprocess.check_output(['ffmpeg', '-version'])
Output: ``` b'ffmpeg version 4.3.2-tessus https://evermeet.cx/ffmpeg/ Copyright (c) 2000-2021 the FFmpeg developers\nbuilt with Apple clang version 11.0.0 (clang-1100.0.33.17)\nconfiguration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-version3 --pkg-config-flags=--static --disable-ffplay\nlibavutil 56. 51.100 / 56. 51.100\nlibavcodec 58. 91.100 / 58. 91.100\nlibavformat 58. 45.100 / 58. 45.100\nlibavdevice 58. 10.100 / 58. 10.100\nlibavfilter 7. 85.100 / 7. 85.100\nlibswscale 5. 7.100 / 5. 7.100\nlibswresample 3. 7.100 / 3. 7.100\nlibpostproc 55. 7.100 / 55. 7.100\n' ```
import torchaudio

SPEECH_FILE = "_cv_corpus/en/clips/common_voice_en_27988632.mp3"

waveform, sample_rate = torchaudio.load(SPEECH_FILE, format="mp3")
Output:

```
0.13.0.dev20220719
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/var/folders/n9/c_vk64v10sxbms738l3v_sk00000gp/T/ipykernel_99033/3276359229.py in <module>
      4 SPEECH_FILE = "_cv_corpus/en/clips/common_voice_en_27988632.mp3"
      5 
----> 6 waveform, sample_rate = torchaudio.load(SPEECH_FILE, format="mp3")

~/.../lib/python3.7/site-packages/torchaudio/backend/sox_io_backend.py in load(filepath, frame_offset, num_frames, normalize, channels_first, format)
    225     if ret is not None:
    226         return ret
--> 227     return _fallback_load(filepath, frame_offset, num_frames, normalize, channels_first, format)
    228 
    229 

~/.../lib/python3.7/site-packages/torchaudio/backend/sox_io_backend.py in _fail_load(filepath, frame_offset, num_frames, normalize, channels_first, format)
     27     format: Optional[str] = None,
     28 ) -> Tuple[torch.Tensor, int]:
---> 29     raise RuntimeError("Failed to load audio from {}".format(filepath))
     30 
     31 

RuntimeError: Failed to load audio from _cv_corpus/en/clips/common_voice_en_27988632.mp3
```

mthrok commented 2 years ago

@lukeellison Looks like the ffmpeg binary you have is statically built. TorchAudio requires the shared libraries of libavXXX, such as libavutil.56.22.100.so, and does not require the ffmpeg command itself.
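As a rough illustration of how to tell the two cases apart, the banner printed by `ffmpeg -version` can be inspected. FFmpeg's configure script builds static libraries by default and only produces shared ones when `--enable-shared` is given, so the absence of that flag is a hint (not proof) that no shared libav* libraries were installed alongside the command. The function names below are hypothetical helpers, not part of any library.

```python
import re
from typing import Dict


def configured_as_shared(version_banner: str) -> bool:
    """Return True if the configure flags in the banner include --enable-shared."""
    match = re.search(r"^configuration:(.*)$", version_banner, flags=re.MULTILINE)
    flags = match.group(1).split() if match else []
    return "--enable-shared" in flags


def library_majors(version_banner: str) -> Dict[str, int]:
    """Extract major versions, e.g. {'libavcodec': 58}, from the banner's library lines."""
    pattern = r"^(lib\w+)\s+(\d+)\."
    found = re.findall(pattern, version_banner, flags=re.MULTILINE)
    return {name: int(major) for name, major in found}
```

The banner itself can be obtained as text with `subprocess.check_output(["ffmpeg", "-version"], text=True)`. For the output posted above, the configuration line contains `--pkg-config-flags=--static` and no `--enable-shared`, so `configured_as_shared` would return False.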

lukeellison commented 2 years ago

Right, okay. From my research, there doesn't seem to be a particularly easy way to install ffmpeg version 4.1 - 4.3 on mac other than installing the static file from here: https://evermeet.cx/ffmpeg/ . Do you know a way to install it with the dependent shared libraries? Maybe it's time to install conda instead. It seems like a bit of a limitation of torchaudio to me, though, if it requires conda to run. Thank you for your time.

alireza-hariri commented 2 years ago

It would be good to show a relevant error message instead of a generic RuntimeError. Anyway, installing ffmpeg 4.3 and the libavutil-dev package solved the problem for me.

lukasschmit commented 1 year ago

yah confirming conda install ffmpeg=4.3 -c conda-forge fixed this for me

drscotthawley commented 1 year ago

Is there an easy way to get this without conda? E.g. conda's not on Colab by default, but pip is. pip's ffmpeg only goes up to v1.4. Installing via apt on Colab gets us v3.4:

! apt install ffmpeg

Reading package lists... Done
Building dependency tree       
Reading state information... Done
ffmpeg is already the newest version (7:3.4.11-0ubuntu0.1).
The following package was automatically installed and is no longer required:
  libnvidia-common-460
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 12 not upgraded.
y = torchaudio.load('my_audio.mp3', format="mp3")
/usr/local/lib/python3.7/dist-packages/torchaudio/backend/sox_io_backend.py in load(filepath, frame_offset, num_frames, normalize, channels_first, format)
    225     if ret is not None:
    226         return ret
--> 227     return _fallback_load(filepath, frame_offset, num_frames, normalize, channels_first, format)
    228 
    229 

/usr/local/lib/python3.7/dist-packages/torchaudio/backend/sox_io_backend.py in _fail_load(filepath, frame_offset, num_frames, normalize, channels_first, format)
     27     format: Optional[str] = None,
     28 ) -> Tuple[torch.Tensor, int]:
---> 29     raise RuntimeError("Failed to load audio from {}".format(filepath))
     30 
     31 

RuntimeError: Failed to load audio from /content/my_audio.mp3
drscotthawley commented 1 year ago

...Follow up: Nope. Even after installing conda on Colab, still getting the same error.

%%bash 
MINICONDA_INSTALLER_SCRIPT=Miniconda3-4.5.4-Linux-x86_64.sh
MINICONDA_PREFIX=/usr/local
wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT
./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX
conda update -n base conda
yes | conda install ffmpeg=4.3 -c conda-forge
yes | conda uninstall torch torchvision torchaudio 
yes | conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

...etc...

even after restarting the runtime and coming back, same error persists.

So, ...

Solution

Forget torchaudio. Just do a

!sudo apt install ffmpeg

then

import librosa

load with librosa and recast to torch.tensor() ✅

mthrok commented 1 year ago

On Google Colab, the following should install a supported FFmpeg.

!add-apt-repository -y ppa:savoury1/ffmpeg4
!apt-get -qq install -y ffmpeg

drscotthawley commented 1 year ago

What about on systems for which we don't have root access, and environments that use a pip (not conda) install? In such cases, I am able to use librosa and an earlier version of ffmpeg to read MP3s, but don't seem to be able to install a sufficiently high version of ffmpeg to get torchaudio to read MP3 files.

mthrok commented 1 year ago

For such environments, the availability of a pre-built FFmpeg binary depends on the system. Some package managers, such as brew on Linux, work without admin privileges. If that does not work for your system, the last resort is to install FFmpeg manually. You can, for example, refer to the GPU decoding tutorial to see what building FFmpeg looks like, although that still requires the codec libraries to be available somehow.

One hack I sometimes use is to copy the library files from another location. This works if both environments are the same. exodus seems to facilitate this, but I have never tried it myself.

Vedaad-Shakib commented 1 year ago

None of the previous advice was working for me (ffmpeg=4.3, torchaudio=0.12.1). I upgraded ffmpeg to version 5.1.2, and now I can load mp3 files.

CharlesSL commented 1 year ago

Solution:

Make sure the ffmpeg codec libraries are linked to libtorchaudio.

Details:

I spent two days solving this problem. My setup is ffmpeg 5.1.1, torchaudio 0.13.1, python 3.9. I tried all of the advice above and none of it worked. Then I searched the official torchaudio documentation and found this about the load function:

To load MP3, FLAC, OGG/VORBIS, OPUS and other codecs libsox does not handle natively, your installation of torchaudio has to be linked to libsox and corresponding codec libraries such as libmad or libmp3lame etc.

  1. Run ldd on the torchaudio FFmpeg extension: ldd ${python_lib}/dist-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
  2. Check that all of the FFmpeg libraries resolve, for example:
    libavdevice.so.58 => /usr/lib/x86_64-linux-gnu/libavdevice.so.58 (0x00007fb903e38000)
    libavfilter.so.7 => /usr/lib/x86_64-linux-gnu/libavfilter.so.7 (0x00007fb903796000)
    libavformat.so.58 => /usr/lib/x86_64-linux-gnu/libavformat.so.58 (0x00007fb9032ed000)
    libavcodec.so.58 => /usr/lib/x86_64-linux-gnu/libavcodec.so.58 (0x00007fb901ce7000)
    libavutil.so.56 => /usr/lib/x86_64-linux-gnu/libavutil.so.56 (0x00007fb901815000)
  3. If not, install them with apt/yum, e.g. apt install libavcodec-dev ...
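The check in steps 1 and 2 can be automated with a small parser for ldd's output, sketched below under the assumption that unresolved dependencies are printed as `name => not found` (the usual glibc ldd format on Linux). The function names are hypothetical, and the path to `libtorchaudio_ffmpeg.so` depends on the installation.

```python
import subprocess
from typing import List


def unresolved_libraries(ldd_output: str) -> List[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_shared_object(path: str) -> List[str]:
    """Run ldd on a shared object (Linux only) and list its unresolved dependencies."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return unresolved_libraries(result.stdout)
```

An empty list from `check_shared_object` means every dependency was resolved; entries such as `libavcodec.so.58` point at the package that still needs installing.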

    Question:

    Although torchaudio.load() now works, I still cannot find 'mp3' in the output of torchaudio.utils.sox_utils.list_read_formats():

    >>> import torchaudio
    >>> torchaudio.utils.sox_utils.list_read_formats()
    ['aifc', 'aiffc', 'aiff', 'aif', 'al', 'au', 'snd', 'avr', 'cdda', 'cdr', 'cvsd', 'cvs', 'cvu', 'dat', 'dvms', 'vms', 'f4', 'f32', 'f8', 'f64', 'gsrt', 'hcom', 'htk', 'ima', 'la', 'lu', 'maud', 'null', 'prc', 'raw', 's1', 's8', 'sb', 's2', 's16', 'sw', 's3', 's24', 's4', 's32', 'sl', 'sf', 'ircam', 'sln', 'smp', 'sndr', 'sndt', 'sox', 'sph', 'nist', '8svx', 'txw', 'u1', 'u8', 'ub', 'sou', 'fssd', 'u2', 'u16', 'uw', 'u3', 'u24', 'u4', 'u32', 'ul', 'voc', 'vox', 'wav', 'wavpcm', 'amb', 'wve', 'xa', 'amr-nb', 'anb', 'amr-wb', 'awb', 'flac', 'gsm', 'lpc10', 'lpc', 'opus', 'vorbis', 'ogg']
orlando-labs commented 1 year ago

For ffmpeg 4.2.8 (devel package installed), I was able to get mp3 support on torchaudio 0.13.1 by explicitly setting USE_FFMPEG=1 when building torchaudio from source.

Joanna1212 commented 8 months ago

I have the same problem (ffmpeg=4.3, torchaudio=0.12). I found /lib64/libavcodec.so.57 under /lib64, but /data/home/yiljiang/anaconda3/envs/mert/bin/../lib/libavcodec.so.58 under the conda env directory.

So I set

# use libavcodec.so.58 under /data/home/yl/anaconda3/envs/mert/bin/../lib/ first
export LD_LIBRARY_PATH=/data/home/yl/envs/mert/bin/../lib/:$LD_LIBRARY_PATH

And then the problem was solved.
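The same idea can be applied from Python when launching a child process. This is a sketch with a hypothetical helper name; it builds a new environment rather than editing `os.environ` in place, because the dynamic loader generally reads `LD_LIBRARY_PATH` when a process starts, so changing it inside an already running interpreter is usually too late.

```python
import os
from typing import Dict, Mapping, Optional


def with_library_dir_first(lib_dir: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `env` with `lib_dir` prepended to LD_LIBRARY_PATH."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    # LD_LIBRARY_PATH is a colon-separated list; earlier entries are searched first.
    new_env["LD_LIBRARY_PATH"] = f"{lib_dir}:{current}" if current else lib_dir
    return new_env
```

A typical use would be `subprocess.run([sys.executable, "train.py"], env=with_library_dir_first("/path/to/env/lib"))`, where both the script name and the directory are placeholders.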

mittalpatel commented 6 months ago

install torchaudio >= 1.9.0

ArtemisZGL commented 6 months ago

Failed to load mp3 with ffmpeg==4.2.7 and torchaudio==2.0.1