juanmc2005 / diart

A python package to build AI-powered real-time audio applications
https://diart.readthedocs.io
MIT License

[joss] Outdated torchaudio can't find newer ffmpeg, conda environment.yml file missing to specify version #158

Closed: sneakers-the-rat closed this issue 1 year ago

sneakers-the-rat commented 1 year ago

I tried to run the benchmark and got an import error from torchaudio:

_______________________________________________ ERROR collecting tests/test_config.py ________________________________________________
ImportError while importing test module 'diart_fork/tests/test_config.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
venv/lib/python3.9/site-packages/torchaudio/_extension.py:71: in _init_ffmpeg
    _load_lib("libtorchaudio_ffmpeg")
venv/lib/python3.9/site-packages/torchaudio/_extension.py:52: in _load_lib
    torch.ops.load_library(path)
venv/lib/python3.9/site-packages/torch/_ops.py:573: in load_library
    ctypes.CDLL(path)
../../.pyenv/versions/3.9.1/lib/python3.9/ctypes/__init__.py:374: in __init__
    self._handle = _dlopen(self._name, mode)
E   OSError: dlopen(diart_fork/venv/lib/python3.9/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so, 0x0006): Library not loaded: '@rpath/libavdevice.58.dylib'
E     Referenced from: 'diart_fork/venv/lib/python3.9/site-packages/torchaudio/lib/libtorchaudio_ffmpeg.so'
E     Reason: tried: '/usr/local/lib/libavdevice.58.dylib' (no such file), '/usr/lib/libavdevice.58.dylib' (no such file)

The above exception was the direct cause of the following exception:
../../.pyenv/versions/3.9.1/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_config.py:7: in <module>
    from diart.console.benchmark import parse_args
src/diart/console/benchmark.py:9: in <module>
    from diart.inference import Benchmark, Parallelize
src/diart/inference.py:8: in <module>
    import diart.sources as src
src/diart/sources.py:11: in <module>
    from torchaudio.io import StreamReader
venv/lib/python3.9/site-packages/torchaudio/io/__init__.py:21: in __getattr__
    torchaudio._extension._init_ffmpeg()
venv/lib/python3.9/site-packages/torchaudio/_extension.py:73: in _init_ffmpeg
    raise ImportError("FFmpeg libraries are not found. Please install FFmpeg.") from err
E   ImportError: FFmpeg libraries are not found. Please install FFmpeg.

This apparently was also raised on Stack Overflow: https://stackoverflow.com/questions/76155851/diart-torchaudio-on-windows-x64-results-in-torchaudio-error-importerror-ffmp

I am not going to debug it right now, but this would have been discovered immediately in CI. The installation instructions are to manually create and populate a conda environment. If the problem is indeed that the old version of torchaudio can't handle newer ffmpeg releases, then installing that way in CI would have caught it, because conda-forge's ffmpeg version is 6.0.0 at the moment: https://anaconda.org/conda-forge/ffmpeg

If the package depends on a conda environment, then I would expect an environment.yml file to be included with the package that specifies the appropriate dependencies.

Part of: https://github.com/openjournals/joss-reviews/issues/5266

juanmc2005 commented 1 year ago

@sneakers-the-rat thank you for reporting this

sneakers-the-rat commented 1 year ago

I think the fix is as simple as making a conda environment file, and I think torchaudio has an updated version?

To prevent future bugs like this, a very simple CI workflow (e.g. GitHub Actions) that installs the package and runs the benchmark would at least ensure that the package runs.
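
A smoke test of that kind could be as small as the sketch below. It is only an illustration, not the project's actual CI: the helper names `find_import_failures` and `main` are hypothetical, and the module list is simply taken from the traceback above, where `diart.sources` is the module that imports `torchaudio.io.StreamReader`.

```python
import importlib


def find_import_failures(module_names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as err:  # ImportError, OSError raised by dlopen, etc.
            failures[name] = f"{type(err).__name__}: {err}"
    return failures


def main():
    """Return a process exit code: 0 if every module imports, 1 otherwise."""
    # Module names taken from the traceback in this issue (an assumption
    # about which imports are worth checking; adjust as needed).
    failures = find_import_failures(["diart.sources", "diart.console.benchmark"])
    for name, message in failures.items():
        print(f"FAILED {name}: {message}")
    return 1 if failures else 0
```

A CI job would install the package first and then end the script with `raise SystemExit(main())`, so that a missing FFmpeg library fails the build instead of surfacing on a user's machine.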

zaouk commented 1 year ago

@sneakers-the-rat on which platform did you try to run the benchmark? Is it Windows too (as per the Stack Overflow link that you shared)? If so, did you try to use diart on WSL instead?

I am using Linux, and I can't reproduce the error that you mentioned. I created a new conda environment and followed the installation steps (first running conda install portaudio pysoundfile ffmpeg -c conda-forge and then pip install diart) and everything works properly.

FWIW, I exported the environment I have into this environment yml file:

name: diart
channels:
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_kmp_llvm
  - alsa-lib=1.2.9=hd590300_0
  - bzip2=1.0.8=h7f98852_4
  - ca-certificates=2023.5.7=hbcca054_0
  - cffi=1.15.1=py38h4a40e3a_3
  - ffmpeg=4.2.2=h20bf706_0
  - freetype=2.10.4=h0708190_1
  - gettext=0.21.1=h27087fc_0
  - gmp=6.2.1=h58526e2_0
  - gnutls=3.6.13=h85f3911_1
  - lame=3.100=h166bdaf_1003
  - ld_impl_linux-64=2.38=h1181459_1
  - libblas=3.9.0=17_linux64_openblas
  - libcblas=3.9.0=17_linux64_openblas
  - libffi=3.4.4=h6a678d5_0
  - libflac=1.4.3=h59595ed_0
  - libgcc-ng=13.1.0=he5830b7_0
  - libgfortran-ng=13.1.0=h69a702a_0
  - libgfortran5=13.1.0=h15d22d2_0
  - liblapack=3.9.0=17_linux64_openblas
  - libogg=1.3.4=h7f98852_1
  - libopenblas=0.3.23=pthreads_h80387f5_0
  - libopus=1.3.1=h7f98852_1
  - libpng=1.6.39=h5eee18b_0
  - libsndfile=1.2.0=hb75c966_0
  - libstdcxx-ng=13.1.0=hfd8a6a1_0
  - libvorbis=1.3.7=h9c3ff4c_0
  - libvpx=1.7.0=h439df22_0
  - llvm-openmp=14.0.6=h9e868ea_0
  - mpg123=1.31.3=hcb278e6_0
  - ncurses=6.4=h6a678d5_0
  - nettle=3.6=he412f7d_0
  - numpy=1.24.4=py38h59b608b_0
  - openh264=2.1.1=h4ff587b_0
  - openssl=3.1.1=hd590300_1
  - pip=23.1.2=py38h06a4308_0
  - portaudio=19.6.0=h0e77e87_8
  - pycparser=2.21=pyhd8ed1ab_0
  - pysoundfile=0.12.1=pyhd8ed1ab_0
  - python=3.8.17=h955ad1f_0
  - python_abi=3.8=2_cp38
  - readline=8.2=h5eee18b_0
  - setuptools=67.8.0=py38h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - wheel=0.38.4=py38h06a4308_0
  - x264=1!157.20191217=h7b6447c_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - absl-py==1.4.0
      - aiohttp==3.8.4
      - aiosignal==1.3.1
      - alembic==1.11.1
      - antlr4-python3-runtime==4.9.3
      - asteroid-filterbanks==0.4.0
      - async-timeout==4.0.2
      - attrs==23.1.0
      - audioread==3.0.0
      - backports-cached-property==1.0.2
      - cachetools==5.3.1
      - certifi==2023.5.7
      - charset-normalizer==3.2.0
      - click==8.1.4
      - cmaes==0.9.1
      - colorama==0.4.6
      - colorlog==6.7.0
      - contourpy==1.1.0
      - cycler==0.11.0
      - decorator==5.1.1
      - diart==0.7.0
      - docopt==0.6.2
      - einops==0.3.2
      - filelock==3.12.2
      - fonttools==4.41.0
      - frozenlist==1.4.0
      - fsspec==2023.6.0
      - google-auth==2.22.0
      - google-auth-oauthlib==1.0.0
      - greenlet==2.0.2
      - grpcio==1.56.0
      - hmmlearn==0.2.8
      - huggingface-hub==0.16.4
      - hyperpyyaml==1.2.1
      - idna==3.4
      - importlib-metadata==6.8.0
      - importlib-resources==6.0.0
      - joblib==1.3.1
      - julius==0.2.7
      - kiwisolver==1.4.4
      - librosa==0.9.2
      - llvmlite==0.40.1
      - mako==1.2.4
      - markdown==3.4.3
      - markdown-it-py==3.0.0
      - markupsafe==2.1.3
      - matplotlib==3.7.2
      - mdurl==0.1.2
      - mpmath==1.3.0
      - multidict==6.0.4
      - networkx==2.8.8
      - numba==0.57.1
      - nvidia-cublas-cu11==11.10.3.66
      - nvidia-cuda-nvrtc-cu11==11.7.99
      - nvidia-cuda-runtime-cu11==11.7.99
      - nvidia-cudnn-cu11==8.5.0.96
      - oauthlib==3.2.2
      - omegaconf==2.3.0
      - optuna==3.2.0
      - packaging==23.1
      - pandas==2.0.3
      - pillow==10.0.0
      - platformdirs==3.8.1
      - pooch==1.7.0
      - primepy==1.3
      - protobuf==3.20.1
      - pyannote-audio==2.1.1
      - pyannote-core==4.5
      - pyannote-database==4.1.3
      - pyannote-metrics==3.2.1
      - pyannote-pipeline==2.3
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pydeprecate==0.3.2
      - pygments==2.15.1
      - pyparsing==3.0.9
      - python-dateutil==2.8.2
      - pytorch-lightning==1.6.5
      - pytorch-metric-learning==1.7.3
      - pytz==2023.3
      - pyyaml==6.0
      - requests==2.31.0
      - requests-oauthlib==1.3.1
      - resampy==0.4.2
      - rich==13.4.2
      - rsa==4.9
      - ruamel-yaml==0.17.28
      - ruamel-yaml-clib==0.2.7
      - rx==3.2.0
      - scikit-learn==1.3.0
      - scipy==1.10.1
      - semver==2.13.0
      - sentencepiece==0.1.99
      - shellingham==1.5.0.post1
      - simplejson==3.19.1
      - singledispatchmethod==1.0
      - six==1.16.0
      - sortedcontainers==2.4.0
      - sounddevice==0.4.6
      - soundfile==0.10.3.post1
      - speechbrain==0.5.14
      - sqlalchemy==2.0.18
      - sympy==1.12
      - tabulate==0.9.0
      - tensorboard==2.13.0
      - tensorboard-data-server==0.7.1
      - threadpoolctl==3.1.0
      - torch==1.13.1
      - torch-audiomentations==0.11.0
      - torch-pitch-shift==1.2.4
      - torchaudio==0.13.1
      - torchmetrics==0.11.4
      - torchvision==0.14.1
      - tqdm==4.65.0
      - typer==0.9.0
      - typing-extensions==4.7.1
      - tzdata==2023.3
      - urllib3==1.26.16
      - websocket-client==1.6.1
      - websocket-server==0.6.4
      - werkzeug==2.3.6
      - yarl==1.9.2
      - zipp==3.16.0

juanmc2005 commented 1 year ago

I confirm this can be solved with ffmpeg<4.4. A temporary fix would be to change the installation instructions:

- conda install portaudio pysoundfile ffmpeg -c conda-forge
+ conda install portaudio=19.6.0 pysoundfile=0.12.1 ffmpeg=4.3 -c conda-forge
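
For anyone landing here with the FFmpeg error, a quick check of the installed version against the ffmpeg<4.4 bound mentioned above might look like the sketch below. The function names are hypothetical, and the regular expression assumes the usual first line of a release build (`ffmpeg version X.Y...`). It inspects the `ffmpeg` executable on PATH, which is not necessarily the shared library that torchaudio ends up loading, so treat the result as a hint only.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(version_output):
    """Return (major, minor) from `ffmpeg -version` output, or None if not found."""
    match = re.search(r"ffmpeg version\s+n?(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(version, upper_bound=(4, 4)):
    """True if the version is known and strictly below upper_bound (ffmpeg<4.4)."""
    return version is not None and version < upper_bound


def installed_ffmpeg_version():
    """Query the ffmpeg executable on PATH; None if it is missing or unparsable."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version(result.stdout)
```

Calling `is_supported(installed_ffmpeg_version())` inside the activated conda environment gives a yes/no answer before importing diart.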

sneakers-the-rat commented 1 year ago

Why not make a conda environment file? That's the standard way of specifying conda environments.

See: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#exporting-an-environment-file-across-platforms

It would be as simple as creating a file named environment.yml. For example, this replicates the environment created via the command line:

name: diart
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.8
  - portaudio=19.6.*
  - pysoundfile=0.12.*
  - ffmpeg[version='<4.4']
  - pip
  - pip:
    - .

or with

- pip:
  - diart

Running conda env export on that environment then produces:

name: diart
channels:
  - conda-forge
  - defaults
dependencies:
  - bzip2=1.0.8=h0d85af4_4
  - ca-certificates=2023.5.7=h8857fd0_0
  - cffi=1.15.1=py38hb368cf1_3
  - ffmpeg=4.3.2=hbf27d7b_3
  - freetype=2.12.1=h3f81eb7_1
  - gettext=0.21.1=h8a4c099_0
  - gmp=6.2.1=h2e338ed_0
  - gnutls=3.6.13=h756fd2b_1
  - lame=3.100=hb7f2c08_1003
  - libblas=3.9.0=17_osx64_openblas
  - libcblas=3.9.0=17_osx64_openblas
  - libcxx=16.0.6=hd57cbcb_0
  - libffi=3.4.2=h0d85af4_5
  - libflac=1.4.3=he965462_0
  - libgfortran=5.0.0=11_3_0_h97931a8_31
  - libgfortran5=12.2.0=he409387_31
  - libiconv=1.17=hac89ed1_0
  - liblapack=3.9.0=17_osx64_openblas
  - libogg=1.3.4=h35c211d_1
  - libopenblas=0.3.23=openmp_h429af6e_0
  - libopus=1.3.1=hc929b4f_1
  - libpng=1.6.39=ha978bb4_0
  - libsndfile=1.2.0=h591af1c_0
  - libsqlite=3.42.0=h58db7d2_0
  - libvorbis=1.3.7=h046ec9c_0
  - libzlib=1.2.13=h8a1eda9_5
  - llvm-openmp=16.0.6=hff08bdf_0
  - mpg123=1.31.3=hf0c8a7f_0
  - ncurses=6.4=hf0c8a7f_0
  - nettle=3.6=hedd7734_0
  - numpy=1.24.4=py38h9a4a08f_0
  - openh264=2.1.1=hfd3ada9_0
  - openssl=3.1.1=h8a1eda9_1
  - pip=23.2=pyhd8ed1ab_0
  - portaudio=19.6.0=he965462_8
  - pycparser=2.21=pyhd8ed1ab_0
  - python=3.8.17=hf9b03c3_0_cpython
  - python_abi=3.8=3_cp38
  - readline=8.2=h9e318b2_1
  - setuptools=68.0.0=pyhd8ed1ab_0
  - tk=8.6.12=h5dbffcc_0
  - wheel=0.40.0=pyhd8ed1ab_1
  - x264=1!161.3030=h0d85af4_1
  - xz=5.2.6=h775f41a_0
  - zlib=1.2.13=h8a1eda9_5
  - pip:
      - absl-py==1.4.0
      - aiohttp==3.8.4
      - aiosignal==1.3.1
      - alembic==1.11.1
      - antlr4-python3-runtime==4.9.3
      - asteroid-filterbanks==0.4.0
      - async-timeout==4.0.2
      - attrs==23.1.0
      - audioread==3.0.0
      - backports-cached-property==1.0.2
      - cachetools==5.3.1
      - certifi==2023.5.7
      - charset-normalizer==3.2.0
      - click==8.1.5
      - cmaes==0.9.1
      - colorama==0.4.6
      - colorlog==6.7.0
      - contourpy==1.1.0
      - cycler==0.11.0
      - decorator==5.1.1
      - diart==0.7.0
      - docopt==0.6.2
      - einops==0.3.2
      - filelock==3.12.2
      - fonttools==4.41.0
      - frozenlist==1.4.0
      - fsspec==2023.6.0
      - google-auth==2.22.0
      - google-auth-oauthlib==1.0.0
      - greenlet==2.0.2
      - grpcio==1.56.0
      - hmmlearn==0.2.8
      - huggingface-hub==0.16.4
      - hyperpyyaml==1.2.1
      - idna==3.4
      - importlib-metadata==6.8.0
      - importlib-resources==6.0.0
      - joblib==1.3.1
      - julius==0.2.7
      - kiwisolver==1.4.4
      - librosa==0.9.2
      - llvmlite==0.40.1
      - mako==1.2.4
      - markdown==3.4.3
      - markdown-it-py==3.0.0
      - markupsafe==2.1.3
      - matplotlib==3.7.2
      - mdurl==0.1.2
      - mpmath==1.3.0
      - multidict==6.0.4
      - networkx==2.8.8
      - numba==0.57.1
      - oauthlib==3.2.2
      - omegaconf==2.3.0
      - optuna==3.2.0
      - packaging==23.1
      - pandas==2.0.3
      - pillow==10.0.0
      - platformdirs==3.9.1
      - pooch==1.7.0
      - primepy==1.3
      - protobuf==3.20.1
      - pyannote-audio==2.1.1
      - pyannote-core==4.5
      - pyannote-database==4.1.3
      - pyannote-metrics==3.2.1
      - pyannote-pipeline==2.3
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pydeprecate==0.3.2
      - pygments==2.15.1
      - pyparsing==3.0.9
      - python-dateutil==2.8.2
      - pytorch-lightning==1.6.5
      - pytorch-metric-learning==1.7.3
      - pytz==2023.3
      - pyyaml==6.0
      - requests==2.31.0
      - requests-oauthlib==1.3.1
      - resampy==0.4.2
      - rich==13.4.2
      - rsa==4.9
      - ruamel-yaml==0.17.28
      - ruamel-yaml-clib==0.2.7
      - rx==3.2.0
      - scikit-learn==1.3.0
      - scipy==1.10.1
      - semver==2.13.0
      - sentencepiece==0.1.99
      - shellingham==1.5.0.post1
      - simplejson==3.19.1
      - singledispatchmethod==1.0
      - six==1.16.0
      - sortedcontainers==2.4.0
      - sounddevice==0.4.6
      - soundfile==0.10.3.post1
      - speechbrain==0.5.14
      - sqlalchemy==2.0.19
      - sympy==1.12
      - tabulate==0.9.0
      - tensorboard==2.13.0
      - tensorboard-data-server==0.7.1
      - threadpoolctl==3.2.0
      - torch==1.13.1
      - torch-audiomentations==0.11.0
      - torch-pitch-shift==1.2.4
      - torchaudio==0.13.1
      - torchmetrics==0.11.4
      - torchvision==0.14.1
      - tqdm==4.65.0
      - typer==0.9.0
      - typing-extensions==4.7.1
      - tzdata==2023.3
      - urllib3==1.26.16
      - websocket-client==1.6.1
      - websocket-server==0.6.4
      - werkzeug==2.3.6
      - yarl==1.9.2
      - zipp==3.16.2

This would then give you a single source of truth for installs, e.g. for CI and tests, rather than needing to keep the install instructions synchronized between the README and the CI.
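
Because the two exports in this thread come from different platforms, it can help to see which pins actually differ between them. The sketch below is a rough illustration with hypothetical helper names. It deliberately avoids a YAML library and only understands the two line shapes that appear in the exports above (`- name=version=build` for conda and `- name==version` for pip), so it is not a general environment-file parser. The two excerpts are a few lines copied from the Linux and macOS exports.

```python
def parse_pins(export_text):
    """Map package name -> pinned version from `conda env export`-style lines."""
    pins = {}
    for raw_line in export_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("- ") or line.endswith(":"):
            continue  # skip keys such as "dependencies:" and "- pip:"
        spec = line[2:]
        if "==" in spec:  # pip entry: name==version
            name, version = spec.split("==", 1)
        elif "=" in spec:  # conda entry: name=version=build
            name, version = spec.split("=")[:2]
        else:
            continue  # unpinned entries such as channel names
        pins[name] = version
    return pins


def differing_pins(first, second):
    """Return {name: (version_in_first, version_in_second)} where versions differ."""
    return {
        name: (first[name], second[name])
        for name in sorted(first.keys() & second.keys())
        if first[name] != second[name]
    }


LINUX_EXCERPT = """
dependencies:
  - ffmpeg=4.2.2=h20bf706_0
  - portaudio=19.6.0=h0e77e87_8
  - pip:
      - click==8.1.4
      - torchaudio==0.13.1
"""

MACOS_EXCERPT = """
dependencies:
  - ffmpeg=4.3.2=hbf27d7b_3
  - portaudio=19.6.0=he965462_8
  - pip:
      - click==8.1.5
      - torchaudio==0.13.1
"""

DIFFERENCES = differing_pins(parse_pins(LINUX_EXCERPT), parse_pins(MACOS_EXCERPT))
```

Only versions are compared, so packages that differ solely in their build string (portaudio here) are not reported.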

The ideal way to package with conda is to use its packaging format: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html

Then one would be able to install it with conda install diart, and other downstream packages would be able to depend on it, but let's call that a "future direction" for now :)

Also, sorry I am stalled out on the review. I'm very busy at work, but I haven't forgotten. I like to be able to take time with the package and help out where I can.

juanmc2005 commented 1 year ago

I agree, the environment.yml file is the correct way to do this. I'm just pointing out a solution for people coming to this issue with the ffmpeg error until I have the time to make the required changes.

juanmc2005 commented 1 year ago

Conda environment.yml added to develop. It will appear in the next release (v0.8).