pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Segmentation Fault #3453

Closed zanussbaum closed 1 year ago

zanussbaum commented 1 year ago

🐛 Describe the bug

import torchaudio
waveform, sr = torchaudio.load('audio.wav')

returns Segmentation fault (core dumped)
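Because a segmentation fault kills the interpreter outright, one way to probe a suspect file without losing the calling process is to run the load in a child interpreter and inspect its exit status. The following is a minimal sketch, not part of torchaudio: `run_isolated` and `crashed_with_segfault` are hypothetical helper names, and the sketch assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> int:
    """Run `code` in a fresh Python interpreter and return its exit status.

    On POSIX, a negative value -N means the child was killed by signal N.
    """
    # "-X faulthandler" asks the child to dump a Python-level traceback
    # to stderr when it receives a fatal signal such as SIGSEGV.
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # Surface whatever the child managed to write before it died.
        print(proc.stderr, file=sys.stderr)
    return proc.returncode


def crashed_with_segfault(returncode: int) -> bool:
    """True if the exit status corresponds to death by SIGSEGV (POSIX only)."""
    return returncode == -signal.SIGSEGV
```

For the report above, the call would be `run_isolated("import torchaudio; torchaudio.load('audio.wav')")`; the parent process survives either way and can log which file triggered the crash.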

Running gdb --args python -c "import torchaudio; torchaudio.load('.assets/bird_audio.wav')" shows the following stack trace:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffec0761884 in ?? () from /lib/x86_64-linux-gnu/libavfilter.so.7
(gdb) bt
#0  0x00007ffec0761884 in ?? () from /lib/x86_64-linux-gnu/libavfilter.so.7
#1  0x00007ffec073a7fd in ?? () from /lib/x86_64-linux-gnu/libavfilter.so.7
#2  0x00007ffec07402e3 in av_buffersrc_add_frame_flags () from /lib/x86_64-linux-gnu/libavfilter.so.7
#3  0x00007fff694bf069 in torchaudio::io::detail::(anonymous namespace)::ProcessImpl<torchaudio::io::AudioConverter<(c10::ScalarType)6, true>, torchaudio::io::detail::UnchunkedBuffer>::process_frame(AVFrame*) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#4  0x00007fff694c35ad in torchaudio::io::StreamProcessor::send_frame(AVFrame*) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#5  0x00007fff694c3639 in torchaudio::io::StreamProcessor::process_packet(AVPacket*) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#6  0x00007fff694c55bb in torchaudio::io::StreamReader::process_packet() ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#7  0x00007fff694c5770 in torchaudio::io::StreamReader::process_all_packets() ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#8  0x00007fff694df178 in torchaudio::io::(anonymous namespace)::_load_audio(torchaudio::io::StreamReader&, int, c10::optional<std::string> const&, bool const&) () from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#9  0x00007fff694df470 in torchaudio::io::(anonymous namespace)::load(std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&) () from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#10 0x00007fff694e2ac9 in std::decay<c10::guts::infer_function_traits<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<std::tuple<at::Tensor, long> (*)(std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&), std::tuple<at::Tensor, long>, c10::guts::typelist::typelist<std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&> > >::type::return_type>::type c10::impl::call_functor_with_args_from_stack_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<std::tuple<at::Tensor, long> (*)(std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&), std::tuple<at::Tensor, long>, c10::guts::typelist::typelist<std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&> >, false, 0ul, 1ul, 2ul, 3ul, std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&>(c10::OperatorKernel*, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*, std::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul>, c10::guts::typelist::typelist<std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&>*) () from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#11 0x00007fff694e354b in c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<std::tuple<at::Tensor, long> (*)(std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&), std::tuple<at::Tensor, long>, c10::guts::typelist::typelist<std::string const&, c10::optional<std::string> const&, c10::optional<std::string> const&, bool const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) () from /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so
#12 0x00007fffc9119742 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const () from /home/paperspace/ib/env/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#13 0x00007fffc8ecef53 in torch::jit::invokeOperatorFromPython(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, pybind11::args, pybind11::kwargs const&, c10::optional<c10::DispatchKey>) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#14 0x00007fffc8ecf848 in torch::jit::_get_operation_for_overload_or_packet(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, c10::Symbol, pybind11::args, pybind11::kwargs const&, bool, c10::optional<c10::DispatchKey>) () from /home/paperspace/ib/env/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#15 0x00007fffc8dbd432 in pybind11::cpp_function::initialize<torch::jit::initJITBindings(_object*)::{lambda(std::string const&)#194}::operator()(std::string const&) const::{lambda(pybind11::args, pybind11::kwargs)#1}, pybind11::object, pybind11::args, pybind11::kwargs, pybind11::name, pybind11::doc>(torch::jit::initJITBindings(_object*)::{lambda(std::string const&)#194}::operator()(std::string const&) const::{lambda(pybind11::args, pybind11::kwargs)#1}&&, pybind11::object (*)(pybind11::args, pybind11::kwargs), pybind11::name const&, pybind11::doc const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#16 0x00007fffc89da855 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()
   from /home/paperspace/ib/env/lib/python3.10/site-packages/torch/lib/libtorch_python.so
#17 0x00005555556b0c9e in cfunction_call (func=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, 
    args=<optimized out>, kwargs=<optimized out>) at ../Objects/methodobject.c:543
#18 0x00005555556bfb5b in _PyObject_Call (kwargs=<optimized out>, 
    args=('.assets/bird_audio.wav', None, 'aformat=sample_fmts=fltp', True), 
    callable=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, tstate=0x555555b5c4c0) at ../Objects/call.c:305
#19 PyObject_Call (callable=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, 
    args=('.assets/bird_audio.wav', None, 'aformat=sample_fmts=fltp', True), kwargs=<optimized out>) at ../Objects/call.c:317
#20 0x000055555569bd87 in do_call_core (kwdict={}, callargs=('.assets/bird_audio.wav', None, 'aformat=sample_fmts=fltp', True), 
    func=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, trace_info=0x7fffffffd590, tstate=<optimized out>)
    at ../Python/ceval.c:5943
#21 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4277
#22 0x00005555556a68b4 in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x7ffec1b437f0, for file /home/paperspace/ib/env/lib/python3.10/site-packages/torch/_ops.py, line 677, in __call__ (self=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>, args=('.assets/bird_audio.wav', None, 'aformat=sample_fmts=fltp', True), kwargs={}), tstate=0x555555b5c4c0) at ../Include/internal/pycore_ceval.h:46
#23 _PyEval_Vector (kwnames=0x0, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=0x7fff645c17f0, tstate=0x555555b5c4c0)
    at ../Python/ceval.c:5065
#24 _PyFunction_Vectorcall (kwnames=0x0, nargsf=<optimized out>, stack=<optimized out>, func=<function at remote 0x7fff645c17e0>)
    at ../Objects/call.c:342
#25 _PyObject_FastCallDictTstate (tstate=0x555555b5c4c0, callable=<function at remote 0x7fff645c17e0>, args=<optimized out>, 
    nargsf=<optimized out>, kwargs=<optimized out>) at ../Objects/call.c:142
#26 0x00005555556bbf9c in _PyObject_Call_Prepend (tstate=0x555555b5c4c0, callable=<function at remote 0x7fff645c17e0>, 
    obj=<optimized out>, args=<optimized out>, kwargs=0x0) at ../Objects/call.c:431
#27 0x00005555557d9050 in slot_tp_call (
    self=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>, 
    args=('.assets/bird_audio.wav', None, 'aformat=sample_fmts=fltp', True), kwds=0x0) at ../Objects/typeobject.c:7494
#28 0x00005555556a772b in _PyObject_MakeTpCall (tstate=0x555555b5c4c0, 
    callable=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>, 
    args=<optimized out>, nargs=<optimized out>, keywords=0x0) at ../Objects/call.c:215
#29 0x00005555556a00e7 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=<optimized out>, 
    callable=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>, 
    tstate=<optimized out>) at ../Include/cpython/abstract.h:112
#30 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb44a95e8, 
    callable=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>, 
    tstate=<optimized out>) at ../Include/cpython/abstract.h:99
#31 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb44a95e8, 
    callable=<OpOverloadPacket(_qualified_op_name='torchaudio::compat_load', __name__='compat_load', _op=<built-in method compat_load of PyCapsule object at remote 0x7ffff7b07c00>, _overload_names=[''], _dir=[], __module__='torch._ops.torchaudio') at remote 0x7ffff7a72ef0>)
    at ../Include/cpython/abstract.h:123
#32 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7fffffffd8c0, tstate=<optimized out>)
    at ../Python/ceval.c:5891
#33 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4181
#34 0x00005555556b14ec in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x7ffeb44a9440, for file /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/io/_compat.py, line 93, in load_audio (src='.assets/bird_audio.wav', frame_offset=0, num_frames=-1, convert=True, channels_first=True, format=None, filter='aformat=sample_fmts=fltp'), tstate=0x555555b5c4c0) at ../Include/internal/pycore_ceval.h:46
#35 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=0x7ffeb4513ad0, 
    tstate=0x555555b5c4c0) at ../Python/ceval.c:5065
#36 _PyFunction_Vectorcall (func=<function at remote 0x7ffeb4513ac0>, stack=<optimized out>, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ../Objects/call.c:342
#37 0x0000555555699a1d in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7ffec16fb250, 
    callable=<function at remote 0x7ffeb4513ac0>, tstate=0x555555b5c4c0) at ../Include/cpython/abstract.h:114
#38 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ffec16fb250, callable=<function at remote 0x7ffeb4513ac0>)
    at ../Include/cpython/abstract.h:123
#39 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7fffffffda90, tstate=<optimized out>)
    at ../Python/ceval.c:5891
#40 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4213
#41 0x00005555556b14ec in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x7ffec16fb0b0, for file /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/_backend/utils.py, line 111, in load (uri='.assets/bird_audio.wav', frame_offset=0, num_frames=-1, normalize=True, channels_first=True, format=None, buffer_size=4096), 
    tstate=0x555555b5c4c0) at ../Include/internal/pycore_ceval.h:46
#42 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=0x7ffeb4488b90, 
    tstate=0x555555b5c4c0) at ../Python/ceval.c:5065
#43 _PyFunction_Vectorcall (func=<function at remote 0x7ffeb4488b80>, stack=<optimized out>, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ../Objects/call.c:342
#44 0x000055555569f75a in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x55555af23bb8, 
    callable=<function at remote 0x7ffeb4488b80>, tstate=0x555555b5c4c0) at ../Include/cpython/abstract.h:114
#45 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x55555af23bb8, callable=<function at remote 0x7ffeb4488b80>)
    at ../Include/cpython/abstract.h:123
#46 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7fffffffdc60, tstate=<optimized out>)
    at ../Python/ceval.c:5891
#47 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4181
#48 0x00005555556b14ec in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x55555af23a00, for file /home/paperspace/ib/env/lib/python3.10/site-packages/torchaudio/_backend/utils.py, line 446, in load (uri='.assets/bird_audio.wav', frame_offset=0, num_frames=-1, normalize=True, channels_first=True, format=None, buffer_size=4096, backend=<ABCMeta(__module__='torchaudio._backend.utils', info=<staticmethod at remote 0x7ffeb45252d0>, load=<staticmethod at remote 0x7ffeb4525300>, save=<staticmethod at remote 0x7ffeb4525330>, can_decode=<staticmethod at remote 0x7ffeb4525360>, can_encode=<staticmethod at remote 0x7ffeb4525390>, __doc__=None, __abstractmethods__=frozenset(), _abc_impl=<_abc._abc_data at remote 0x7ffec2236140>) at remote 0x55555afb4770>), tstate=0x555555b5c4c0) at ../Include/internal/pycore_ceval.h:46
#49 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=0x7ffeb44899a0, 
    tstate=0x555555b5c4c0) at ../Python/ceval.c:5065
#50 _PyFunction_Vectorcall (func=<function at remote 0x7ffeb4489990>, stack=<optimized out>, nargsf=<optimized out>, 
    kwnames=<optimized out>) at ../Objects/call.c:342
#51 0x000055555569f75a in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7ffff7b2e190, 
    callable=<function at remote 0x7ffeb4489990>, tstate=0x555555b5c4c0) at ../Include/cpython/abstract.h:114
#52 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ffff7b2e190, callable=<function at remote 0x7ffeb4489990>)
    at ../Include/cpython/abstract.h:123
#53 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7fffffffde30, tstate=<optimized out>)
    at ../Python/ceval.c:5891
#54 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4181
#55 0x0000555555696176 in _PyEval_EvalFrame (throwflag=0, f=Frame 0x7ffff7b2e020, for file <string>, line 1, in <module> (), 
    tstate=0x555555b5c4c0) at ../Include/internal/pycore_ceval.h:46
#56 _PyEval_Vector (tstate=0x555555b5c4c0, con=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, 
    kwnames=<optimized out>) at ../Python/ceval.c:5065
#57 0x000055555578bc56 in PyEval_EvalCode (co=<code at remote 0x7ffff79db3c0>, 
    globals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    locals=<optimized out>) at ../Python/ceval.c:1134
#58 0x00005555557b8b18 in run_eval_code_obj (tstate=0x555555b5c4c0, co=0x7ffff79db3c0, 
    globals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    locals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>})
    at ../Python/pythonrun.c:1291
#59 0x00005555557b196b in run_mod (mod=<optimized out>, filename=<optimized out>, 
    globals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    locals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    flags=<optimized out>, arena=<optimized out>) at ../Python/pythonrun.c:1312
#60 0x00005555557a9f21 in PyRun_StringFlags (str=<optimized out>, start=257, 
    globals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    locals={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <type at remote 0x555555b6de00>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff7b905e0>, 'torchaudio': <module at remote 0x7ffff7a764d0>}, 
    flags=0x7fffffffe050) at ../Python/pythonrun.c:1183
#61 0x00005555557a9dd1 in PyRun_SimpleStringFlags (
    command=0x7ffff7a050d0 "import torchaudio; torchaudio.load('.assets/bird_audio.wav')\n", flags=0x7fffffffe050)
    at ../Python/pythonrun.c:503
#62 0x00005555557a8cf5 in pymain_run_command (command=<optimized out>) at ../Modules/main.c:248
#63 pymain_run_python (exitcode=0x7fffffffe044) at ../Modules/main.c:578
#64 Py_RunMain () at ../Modules/main.c:666
#65 0x000055555577ebcd in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at ../Modules/main.c:720
#66 0x00007ffff7c29d90 in __libc_start_call_main (main=main@entry=0x55555577eb90 <main>, argc=argc@entry=3, 
    argv=argv@entry=0x7fffffffe258) at ../sysdeps/nptl/libc_start_call_main.h:58
#67 0x00007ffff7c29e40 in __libc_start_main_impl (main=0x55555577eb90 <main>, argc=3, argv=0x7fffffffe258, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe248) at ../csu/libc-start.c:392
#68 0x000055555577eac5 in _start ()

Versions

Collecting environment information...
PyTorch version: 2.1.0.dev20230703+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.19.0-45-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA H100 80GB HBM3
GPU 1: NVIDIA H100 80GB HBM3
GPU 2: NVIDIA H100 80GB HBM3
GPU 3: NVIDIA H100 80GB HBM3
GPU 4: NVIDIA H100 80GB HBM3
GPU 5: NVIDIA H100 80GB HBM3
GPU 6: NVIDIA H100 80GB HBM3
GPU 7: NVIDIA H100 80GB HBM3

Nvidia driver version: 530.30.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   52 bits physical, 57 bits virtual
Byte Order:                      Little Endian
CPU(s):                          192
On-line CPU(s) list:             0-191
Vendor ID:                       AuthenticAMD
Model name:                      AMD EPYC 9474F 48-Core Processor
CPU family:                      25
Model:                           17
Thread(s) per core:              2
Core(s) per socket:              48
Socket(s):                       2
Stepping:                        1
Frequency boost:                 enabled
CPU max MHz:                     4113.2808
CPU min MHz:                     1500.0000
BogoMIPS:                        7199.97
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d
Virtualization:                  AMD-V
L1d cache:                       3 MiB (96 instances)
L1i cache:                       3 MiB (96 instances)
L2 cache:                        96 MiB (96 instances)
L3 cache:                        512 MiB (16 instances)
NUMA node(s):                    2
NUMA node0 CPU(s):               0-47,96-143
NUMA node1 CPU(s):               48-95,144-191
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected

Versions of relevant libraries:
[pip3] numpy==1.24.1
[pip3] pytorch-triton==2.1.0+440fd1bf20
[pip3] pytorchvideo==0.1.5
[pip3] torch==2.1.0.dev20230703+cu121
[pip3] torchaudio==2.1.0.dev20230703+cu121
[pip3] torchvision==0.16.0.dev20230703+cu121
[conda] Could not collect

Here is my ffmpeg version as well, since #3411 suggests that it needs to be < 5:

ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Hyper fast Audio and Video encoder
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...
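The backtrace above crashes inside libavfilter.so.7, which matches the libavfilter 7.x line in this banner. When comparing several machines, it can help to pull those major versions out of the banner programmatically. This is a rough sketch that only parses text in the format shown above; `parse_ffmpeg_banner` is a hypothetical helper, and nothing in this thread establishes which versions torchaudio actually accepts.

```python
import re


def parse_ffmpeg_banner(banner: str) -> dict:
    """Extract the ffmpeg version string and per-library major versions
    from text in the format printed by `ffmpeg -version`."""
    info = {"ffmpeg": None, "libs": {}}

    # First line looks like: "ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright ..."
    first = re.search(r"^ffmpeg version (\S+)", banner, flags=re.MULTILINE)
    if first:
        info["ffmpeg"] = first.group(1)

    # Library lines look like: "  libavfilter     7.110.100 /  7.110.100"
    lib_line = re.compile(
        r"^[ \t]*(lib\w+)[ \t]+(\d+)\.[ \t]*\d+\.[ \t]*\d+[ \t]*/",
        flags=re.MULTILINE,
    )
    for name, major in lib_line.findall(banner):
        info["libs"][name] = int(major)
    return info
```

The input can come from `subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True).stdout`. Note that this reports the system ffmpeg found on `PATH`, which is not necessarily the set of shared libraries that torchaudio loads at runtime.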

mthrok commented 1 year ago

Hi @zanussbaum

This looks like a data-dependent issue. The code path is tested in many places, and I have never seen this. Any chance you can share the data?

zanussbaum commented 1 year ago

@mthrok Hm that's weird. I'm able to load the same file in torchaudio when using 1.13 but not on this latest version.

I've tested it by running the example in ImageBind, and I am able to get a forward pass to work with 1.13 but not with 2.1 for the following code:

import data
import torch
from models import imagebind_model
from models.imagebind_model import ModalityType

text_list=["A dog.", "A car", "A bird"]
image_paths=[".assets/dog_image.jpg", ".assets/car_image.jpg", ".assets/bird_image.jpg"]
audio_paths=[".assets/dog_audio.wav", ".assets/car_audio.wav", ".assets/bird_audio.wav"]

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Instantiate model
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

# Load data
inputs = {
    ModalityType.TEXT: data.load_and_transform_text(text_list, device),
    ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device),
    ModalityType.AUDIO: data.load_and_transform_audio_data(audio_paths, device),
}

with torch.no_grad():
    embeddings = model(inputs)

print(
    "Vision x Text: ",
    torch.softmax(embeddings[ModalityType.VISION] @ embeddings[ModalityType.TEXT].T, dim=-1),
)
print(
    "Audio x Text: ",
    torch.softmax(embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T, dim=-1),
)
print(
    "Vision x Audio: ",
    torch.softmax(embeddings[ModalityType.VISION] @ embeddings[ModalityType.AUDIO].T, dim=-1),
)

Although I should note that to get this working with torchaudio 1.13, I had to switch machines, as the H100s (IIUC) use a different instruction set and require torch 2.0.0 or later to use the GPUs effectively. I was still using the same audio file from the repo above.

zanussbaum commented 1 year ago

Hm, this now seems to work with:

Versions of relevant libraries:
[pip3] numpy==1.24.1
[pip3] pytorch-triton==2.1.0+440fd1bf20
[pip3] torch==2.1.0.dev20230709+cu121
[pip3] torchaudio==2.1.0.dev20230709+cu121
[pip3] torchvision==0.16.0.dev20230709+cu121
[conda] Could not collect

Closing as this seems to be resolved

philgzl commented 1 year ago

This needs to be reopened. I am facing the same issue on multiple systems, even with the versions mentioned above. Here is one of my files: 00000_mixture.flac.tar.gz

Both torchaudio.load and torchaudio.info cause the segmentation fault:

import torchaudio
torchaudio.info('00000_mixture.flac')  # Segmentation fault

I have no problem with earlier nightly versions, e.g. this works fine:

torch==2.1.0.dev20230508+cu121
torchaudio==2.1.0.dev20230508+cu121
torchvision==0.16.0.dev20230508+cu121

I did not try and find which nightly version introduced the bug though.
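Finding the nightly that introduced a regression is a bisection problem. The following is a minimal sketch of that search, assuming the build identifiers are ordered oldest to newest and that every build after the first bad one is also bad. `first_bad` and the `is_bad` callback are hypothetical; in practice `is_bad` would install the given nightly into a scratch environment and check whether loading the file crashes.

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> Optional[str]:
    """Binary-search `versions` (oldest first) for the earliest failing build.

    Assumes failures are monotonic: once a build is bad, all later ones are.
    Returns None if no build in the list is bad.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            # The first bad build is at mid or earlier.
            hi = mid
        else:
            # Everything up to and including mid is good.
            lo = mid + 1
    return versions[lo] if lo < len(versions) else None
```

This needs about log2(n) installs instead of n. The monotonic assumption matters: the reports in this thread suggest the crash may come and go between nightlies, in which case a plain linear scan over the candidate builds is the safer choice.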

philgzl commented 1 year ago

Mmh, never mind, the latest nightly build works for me. Weird.