pyg-team / pyg-lib

Low-Level Graph Neural Network Operators for PyG
https://pyg-lib.readthedocs.io

Build failed on ppc64 #354

Open dmagee opened 2 months ago

dmagee commented 2 months ago

😵 Describe the installation problem

Building from master on powerpc64, I get the following error:

$ pip  install git+https://github.com/pyg-team/pyg-lib.git --config-settings="INCLUDE=/path/to/python/cuda/include"
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting git+https://github.com/pyg-team/pyg-lib.git
  Cloning https://github.com/pyg-team/pyg-lib.git to /tmp/pip-req-build-5maim7ab
  Running command git clone --filter=blob:none --quiet https://github.com/pyg-team/pyg-lib.git /tmp/pip-req-build-5maim7ab
  Resolved https://github.com/pyg-team/pyg-lib.git to commit 8f819642680295057ecab9e8b516eaf29942782a
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: pyg_lib
  Building wheel for pyg_lib (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for pyg_lib (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [75 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build/lib.linux-ppc64le-cpython-311/pyg_lib
      copying pyg_lib/__init__.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib
      copying pyg_lib/_triton.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib
      copying pyg_lib/home.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib
      copying pyg_lib/testing.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib
      creating build/lib.linux-ppc64le-cpython-311/pyg_lib/ops
      copying pyg_lib/ops/__init__.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib/ops
      copying pyg_lib/ops/scatter_reduce.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib/ops
      creating build/lib.linux-ppc64le-cpython-311/pyg_lib/partition
      copying pyg_lib/partition/__init__.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib/partition
      creating build/lib.linux-ppc64le-cpython-311/pyg_lib/sampler
      copying pyg_lib/sampler/__init__.py -> build/lib.linux-ppc64le-cpython-311/pyg_lib/sampler
      running build_ext
      Traceback (most recent call last):
        File "/nobackup/projects/bdlds05/cscdrm/env/miniconda/envs/joe/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/nobackup/projects/bdlds05/cscdrm/env/miniconda/envs/joe/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/nobackup/projects/bdlds05/cscdrm/env/miniconda/envs/joe/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 421, in build_wheel
          return self._build_with_temp_dir(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 403, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 503, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 318, in run_setup
          exec(code, locals())
        File "<string>", line 133, in <module>
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/__init__.py", line 117, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 183, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/command/bdist_wheel.py", line 398, in run
          self.run_command("build")
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 98, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 476, in build_extensions
          self._build_extensions_serial()
        File "/tmp/pip-build-env-0q6a203a/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 502, in _build_extensions_serial
          self.build_extension(ext)
        File "<string>", line 42, in build_extension
      ModuleNotFoundError: No module named 'torch'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyg_lib
Failed to build pyg_lib
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (pyg_lib)

Note: I have pytorch installed in this conda virtual environment:

$ pip list
Package            Version
------------------ --------------
aiohappyeyeballs   2.4.3
aiohttp            3.10.8
aiosignal          1.3.1
attrs              24.2.0
av                 10.0.0
Brotli             1.0.9
certifi            2020.6.20
cffi               1.15.1
chardet            3.0.4
charset-normalizer 3.3.2
click              8.1.7
cryptography       41.0.7
cuda-python        12.2.0
Cython             3.0.11
docker             4.3.1
docker-pycreds     0.4.0
filelock           3.9.0
frozenlist         1.4.1
fsspec             2023.10.0
gmpy2              2.1.2
idna               2.10
Jinja2             3.1.2
MarkupSafe         2.1.1
mpmath             1.3.0
multidict          6.1.0
networkx           2.8.4
ninja              1.11.1.1
numpy              1.26.0
nvidia-pyindex     1.0.9
nvidia-tlt         0.1.21
Pillow             10.0.1
pip                24.2
protobuf           4.21.12
psutil             6.0.0
pycparser          2.21
pyOpenSSL          23.2.0
pyparsing          3.1.4
PySocks            1.7.1
PyYAML             6.0.1
requests           2.24.0
scipy              1.11.3
sentencepiece      0.1.99
setuptools         75.1.0
six                1.15.0
sympy              1.11.1
tabulate           0.8.7
torch              2.1.2
torch-cluster      1.6.3
torch-geometric    2.6.1
torch-scatter      2.1.2
torch-sparse       0.6.18
torch-spline-conv  1.2.2
torchdata          0.7.1+5e6f7b7
torchtext          0.16.2+1fc66c9
torchvision        0.16.2
tqdm               4.65.0
typing_extensions  4.7.1
urllib3            1.25.10
websocket-client   0.57.0
wheel              0.41.2
yarl               1.13.1

Environment

I had to pass the option --config-settings="INCLUDE=/path/to/python/cuda/include" because the default include path was not where CUDA was installed. CUDA was installed with "pip install cuda-python==12.2".

akihironitta commented 2 months ago

I think building from source requires torch to be installed in advance as I see:

      ModuleNotFoundError: No module named 'torch'

Have you had a chance to try installing torch first before building pyg-lib?

dmagee commented 2 months ago

Torch is installed !!!

"Note: I have pytorch installed in this conda virtual environment:"

torch 2.1.2

dmagee commented 1 month ago

After a bit of googling, I think the issue is that pip builds the package in a new isolated environment in /tmp containing only the declared build dependencies, rather than using the active virtual environment (unless the --no-build-isolation option is used). Unfortunately, with --no-build-isolation the workaround for the CUDA header location (--config-settings="INCLUDE=/actual/path/to/python/cuda/include") no longer works, and the build fails with:

cc1plus: fatal error: cuda_runtime.h: No such file or directory

Thus I think there are two issues here:

1) pyg-lib doesn't declare torch as a build dependency, so it isn't installed into the isolated build environment that pip creates in /tmp

2) The build expects the CUDA headers to be in the following directory (or at least this is the only include directory passed to gcc in the build):

path-to_my_virtual_environment/bin/../targets/ppc64le-linux/include, i.e. path-to_my_virtual_environment/targets/ppc64le-linux/include

This directory doesn't exist. The header in question is actually in:

path-to_my_virtual_environment/pkgs/cuda-toolkit/targets/ppc64le-linux/include
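
On the first point, a minimal sketch (standard library only) of how to check which interpreter can actually see torch. The helper names `is_importable` and `report` are made up for illustration and are not part of pip or pyg-lib. Run from the conda environment, it should report torch as found. The same check run by the build backend under pip's default build isolation would only see the temporary build environment, which would be consistent with the ModuleNotFoundError above.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module.

    Hypothetical helper for illustration; only intended for top-level
    names such as "torch", not dotted submodule paths.
    """
    return importlib.util.find_spec(module_name) is not None


def report(module_name: str = "torch") -> str:
    """Describe whether the module is visible and which environment was asked."""
    status = "found" if is_importable(module_name) else "NOT found"
    return f"{module_name}: {status} (interpreter prefix: {sys.prefix})"


if __name__ == "__main__":
    print(report("torch"))
```

The prefix in the output shows which environment answered, which helps tell the conda environment apart from a temporary build environment under /tmp.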

CUDA was installed into the virtual environment via pip:

pip install cuda-python==12.2
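
On the second point, a small sketch for locating the directories inside the active environment that actually contain cuda_runtime.h. It assumes the header lives somewhere under `sys.prefix`. The function name `find_header_dirs` is hypothetical and not part of any library.

```python
import sys
from pathlib import Path


def find_header_dirs(root, header="cuda_runtime.h"):
    """Return the sorted directories under `root` that contain `header`."""
    root = Path(root)
    # rglob walks the tree recursively; a set removes duplicate parents.
    return sorted({p.parent for p in root.rglob(header) if p.is_file()})


if __name__ == "__main__":
    # sys.prefix is the root of the active conda/virtual environment.
    for include_dir in find_header_dirs(sys.prefix):
        print(include_dir)
```

GCC also reads extra include directories from the CPATH and CPLUS_INCLUDE_PATH environment variables, so exporting one of the printed directories there before a --no-build-isolation build may be worth trying. Whether pyg-lib's build picks that up is untested here.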

Any thoughts on fixing either of these issues?