huggingface / optimum-nvidia

Unable to install `optimum-nvidia` on Ubuntu #144

Open QuantumStaticFR opened 2 weeks ago

QuantumStaticFR commented 2 weeks ago

Platform

Ubuntu: 24.04 LTS
Python: 3.8/3.10/3.12

Steps

Given Installation Instructions

  1. apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev
  2. python -m pip install --pre --extra-index-url https://pypi.nvidia.com optimum-nvidia

Steps I tried apart from the given instructions

  3. Created a fresh Python virtual environment (python3.x -m venv myvenv) and activated it
  4. Ran pip install optimum
  5. Ran pip install optimum-nvidia
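
One visible difference between the two command sequences above is that the documented command passes `--pre` and `--extra-index-url https://pypi.nvidia.com`, while steps 4 and 5 call pip without them. A minimal sketch of building the documented command for whichever interpreter is currently active (for example, the one inside the virtual environment) is shown here. The helper name `build_install_command` is hypothetical, and whether the missing flags explain the later errors is not confirmed in this thread.

```python
import subprocess
import sys

# Index URL taken from the installation instructions quoted in this issue.
NVIDIA_INDEX = "https://pypi.nvidia.com"


def build_install_command(python=sys.executable, package="optimum-nvidia"):
    """Return the documented pip command as an argument list.

    Using `python -m pip` with an explicit interpreter avoids installing
    into a different Python than the one the virtual environment uses.
    """
    return [
        python, "-m", "pip", "install",
        "--pre",
        "--extra-index-url", NVIDIA_INDEX,
        package,
    ]


if __name__ == "__main__":
    command = build_install_command()
    print("Running:", " ".join(command))
    # check=False so a failed install reports its exit code instead of raising.
    result = subprocess.run(command, check=False)
    print("pip exit code:", result.returncode)
```

Passing the command as a list (rather than a shell string) keeps the flags and the URL as separate arguments, so nothing needs quoting.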

Error when following the given installation guide

Building wheels for collected packages: mpi4py
  Building wheel for mpi4py (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for mpi4py (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [149 lines of output]
      running bdist_wheel
      running build
      running build_src
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-310
      creating build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/bench.py -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/__main__.py -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/__init__.py -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/run.py -> build/lib.linux-x86_64-cpython-310/mpi4py
      creating build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/_base.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/_lib.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/__main__.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/_core.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/aplus.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/__init__.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/server.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/pool.py -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      creating build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/util/pkl5.py -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/util/dtlib.py -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/util/__init__.py -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/bench.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/__init__.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/dl.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/run.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/__main__.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/MPI.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/py.typed -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/__init__.pxd -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/libmpi.pxd -> build/lib.linux-x86_64-cpython-310/mpi4py
      copying src/mpi4py/MPI.pxd -> build/lib.linux-x86_64-cpython-310/mpi4py
      creating build/lib.linux-x86_64-cpython-310/mpi4py/include
      creating build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/include/mpi4py/mpi4py.MPI_api.h -> build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/include/mpi4py/mpi4py.MPI.h -> build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/include/mpi4py/mpi4py.h -> build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/include/mpi4py/mpi4py.i -> build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/include/mpi4py/mpi.pxi -> build/lib.linux-x86_64-cpython-310/mpi4py/include/mpi4py
      copying src/mpi4py/futures/__init__.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/_core.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/aplus.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/server.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/__main__.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/pool.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/futures/_lib.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/futures
      copying src/mpi4py/util/__init__.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/util/pkl5.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      copying src/mpi4py/util/dtlib.pyi -> build/lib.linux-x86_64-cpython-310/mpi4py/util
      running build_clib
      MPI configuration: [mpi] from 'mpi.cfg'
      <string>:135: DeprecationWarning: Use shutil.which instead of find_executable
      checking for library 'lmpe' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -llmpe -o _configtest
      /usr/bin/ld: cannot find -llmpe: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      building 'mpe' dylib library
      creating build/temp.linux-x86_64-cpython-310
      creating build/temp.linux-x86_64-cpython-310/src
      creating build/temp.linux-x86_64-cpython-310/src/lib-pmpi
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c src/lib-pmpi/mpe.c -o build/temp.linux-x86_64-cpython-310/src/lib-pmpi/mpe.o
      creating build/lib.linux-x86_64-cpython-310/mpi4py/lib-pmpi
      gcc-9 -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-cpython-310/src/lib-pmpi/mpe.o -o build/lib.linux-x86_64-cpython-310/mpi4py/lib-pmpi/libmpe.so
      checking for library 'vt-mpi' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt-mpi -o _configtest
      /usr/bin/ld: cannot find -lvt-mpi: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      checking for library 'vt.mpi' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt.mpi -o _configtest
      /usr/bin/ld: cannot find -lvt.mpi: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      building 'vt' dylib library
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt.c -o build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt.o
      gcc-9 -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt.o -o build/lib.linux-x86_64-cpython-310/mpi4py/lib-pmpi/libvt.so
      checking for library 'vt-mpi' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt-mpi -o _configtest
      /usr/bin/ld: cannot find -lvt-mpi: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      checking for library 'vt.mpi' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt.mpi -o _configtest
      /usr/bin/ld: cannot find -lvt.mpi: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      building 'vt-mpi' dylib library
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt-mpi.c -o build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt-mpi.o
      gcc-9 -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt-mpi.o -o build/lib.linux-x86_64-cpython-310/mpi4py/lib-pmpi/libvt-mpi.so
      checking for library 'vt-hyb' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt-hyb -o _configtest
      /usr/bin/ld: cannot find -lvt-hyb: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      checking for library 'vt.ompi' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -lvt.ompi -o _configtest
      /usr/bin/ld: cannot find -lvt.ompi: No such file or directory
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      building 'vt-hyb' dylib library
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt-hyb.c -o build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt-hyb.o
      gcc-9 -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-cpython-310/src/lib-pmpi/vt-hyb.o -o build/lib.linux-x86_64-cpython-310/mpi4py/lib-pmpi/libvt-hyb.so
      running build_ext
      MPI configuration: [mpi] from 'mpi.cfg'
      checking for dlopen() availability ...
      checking for header 'dlfcn.h' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -I/home/fileread/Desktop/Utkarsh/ML/optimum/optimumvenv/include -I/usr/include/python3.10 -c _configtest.c -o _configtest.o
      success!
      removing: _configtest.c _configtest.o
      success!
      checking for library 'dl' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -I/home/fileread/Desktop/Utkarsh/ML/optimum/optimumvenv/include -I/usr/include/python3.10 -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -L/usr/lib/x86_64-linux-gnu -Lbuild/temp.linux-x86_64-cpython-310 -Wl,--enable-new-dtags,-rpath,/usr/lib/x86_64-linux-gnu -ldl -o _configtest
      success!
      removing: _configtest.c _configtest.o _configtest
      checking for function 'dlopen' ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -I/home/fileread/Desktop/Utkarsh/ML/optimum/optimumvenv/include -I/usr/include/python3.10 -c _configtest.c -o _configtest.o
      gcc-9 _configtest.o -L/usr/lib/x86_64-linux-gnu -Lbuild/temp.linux-x86_64-cpython-310 -Wl,--enable-new-dtags,-rpath,/usr/lib/x86_64-linux-gnu -ldl -o _configtest
      success!
      removing: _configtest.c _configtest.o _configtest
      building 'mpi4py.dl' extension
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -DHAVE_DLFCN_H=1 -DHAVE_DLOPEN=1 -I/home/fileread/Desktop/Utkarsh/ML/optimum/optimumvenv/include -I/usr/include/python3.10 -c src/dynload.c -o build/temp.linux-x86_64-cpython-310/src/dynload.o
      gcc-9 -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 build/temp.linux-x86_64-cpython-310/src/dynload.o -L/usr/lib/x86_64-linux-gnu -Lbuild/temp.linux-x86_64-cpython-310 -Wl,--enable-new-dtags,-rpath,/usr/lib/x86_64-linux-gnu -ldl -o build/lib.linux-x86_64-cpython-310/mpi4py/dl.cpython-310-x86_64-linux-gnu.so
      checking for MPI compile and link ...
      gcc-9 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -g -fwrapv -O2 -fPIC -I/home/fileread/Desktop/Utkarsh/ML/optimum/optimumvenv/include -I/usr/include/python3.10 -c _configtest.c -o _configtest.o
      _configtest.c:2:10: fatal error: mpi.h: No such file or directory
          2 | #include <mpi.h>
            |          ^~~~~~~
      compilation terminated.
      failure.
      removing: _configtest.c _configtest.o
      error: Cannot compile MPI programs. Check your configuration!!!
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for mpi4py
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py, which is required to install pyproject.toml-based projects
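
The build above stops at `#include <mpi.h>`, which means the compiler could not find the MPI development headers while building mpi4py from source. On Ubuntu these headers normally come from an MPI development package such as `libopenmpi-dev`, which the installation instructions already list. A rough diagnostic sketch follows. The directory list is an assumption about common install locations, not something taken from this thread, and the function names are hypothetical.

```python
import shutil
from pathlib import Path

# Assumed candidate locations for mpi.h; adjust for your system.
DEFAULT_INCLUDE_DIRS = [
    "/usr/include",
    "/usr/include/x86_64-linux-gnu/mpi",
    "/usr/lib/x86_64-linux-gnu/openmpi/include",
    "/usr/local/include",
]


def find_mpi_header(include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first existing path to mpi.h, or None if none is found."""
    for directory in include_dirs:
        candidate = Path(directory) / "mpi.h"
        if candidate.is_file():
            return candidate
    return None


def mpi_build_report(include_dirs=DEFAULT_INCLUDE_DIRS):
    """Collect the two things an MPI source build typically needs."""
    return {
        "mpicc": shutil.which("mpicc"),  # MPI compiler wrapper on PATH, if any
        "mpi.h": find_mpi_header(include_dirs),
    }


if __name__ == "__main__":
    for name, location in mpi_build_report().items():
        print(f"{name}: {location or 'not found'}")
```

If both entries print `not found`, the MPI development files are probably missing or installed somewhere the build does not look.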

Errors on different Python versions using my steps

Python 3.8

ERROR: Ignored the following versions that require a different python version: 0.1.0b5 Requires-Python >=3.10; 0.1.0b6 Requires-Python >=3.10; 0.1.0b7 Requires-Python >=3.10
ERROR: Could not find a version that satisfies the requirement optimum-nvidia (from versions: none)
ERROR: No matching distribution found for optimum-nvidia

Python 3.10

Collecting nvidia-ammo~=0.7.0
  Downloading nvidia-ammo-0.7.4.tar.gz (6.9 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-cbuy0b6u/nvidia-ammo_ab45a6d46c164fb680b975db6a9a2ff3/setup.py", line 90, in <module>
          raise RuntimeError("Bad params")
      RuntimeError: Bad params
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Python 3.12

Collecting tensorrt==9.3.0.post12.dev1 (from tensorrt-llm==0.9.0->optimum-nvidia)
  Downloading tensorrt-9.3.0.post12.dev1.tar.gz (6.9 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      Traceback (most recent call last):
        File "/home/path/to/optimum/optimumvenv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/path/to/optimum/optimumvenv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/path/to/optimum/optimumvenv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-vvzluoxx/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 327, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-vvzluoxx/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 297, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-vvzluoxx/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-vvzluoxx/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 90, in <module>
      RuntimeError: Bad params
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
mfuntowicz commented 2 weeks ago

Thanks @QuantumStaticFR, and sorry you are having issues installing optimum-nvidia.

I would recommend forwarding this error to the TensorRT-LLM repository, because the error you're facing comes from TensorRT-LLM's transitive dependency on MPI.

Also, on a general note: you should stick to Python 3.10, as this is the supported Python version for TensorRT-LLM.
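
As a quick way to act on this advice, the interpreter version can be checked before running pip. The sketch below compares the running interpreter against 3.10. The labels it returns ("supported", "too old", "untested") and the function name are hypothetical names chosen for illustration. "too old" corresponds to the Python 3.8 output above, where pip ignored every release because of `Requires-Python >=3.10`.

```python
import sys

# Python 3.10 is the version named in this thread as supported by TensorRT-LLM.
SUPPORTED = (3, 10)


def describe_python(version=None):
    """Classify a (major, minor, ...) version tuple relative to Python 3.10."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if major_minor == SUPPORTED:
        return "supported"
    if major_minor < SUPPORTED:
        return "too old"
    return "untested"


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {running}: {describe_python()}")
```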

@laikhtewari for viz on Nvidia side

TanvirKasir786 commented 1 week ago

conda install mpi4py solved the issue for me. Make sure you are using Python 3.10. If your environment is already using Python 3.10, then use the following:

  1. apt-get update && apt-get -y install openmpi-bin libopenmpi-dev
  2. conda install mpi4py (or python -m pip install mpi4py)
  3. python -m pip install --pre --extra-index-url https://pypi.nvidia.com optimum-nvidia
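
After following a workaround like this one, it can help to confirm that the packages actually landed in the active environment before retrying anything else. The following is a small sketch using only the standard library. The helper names are hypothetical, and it only checks that the packages are present, not that MPI or TensorRT-LLM work at runtime.

```python
import importlib.util
from importlib import metadata


def module_available(module_name):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("mpi4py importable:", module_available("mpi4py"))
    for name in ("mpi4py", "optimum-nvidia"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```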