prefix-dev / pixi

Package management made easy
https://pixi.sh
BSD 3-Clause "New" or "Revised" License

Building PIP packages with PyTorch/Cuda dependencies #2033

Open jdumas opened 1 week ago

jdumas commented 1 week ago

Reproducible example

[project]
authors = ["Name"]
channels = ["nvidia/label/cuda-12.4.0", "nvidia", "conda-forge", "pytorch"]
description = "Add a short description here"
name = "sandbox"
platforms = ["win-64"]
version = "0.1.0"

[dependencies]
python = ">=3.8,<3.11"
pip = ">=24.0,<25"
cuda = "*"
pytorch = ">=2.4.1,<2.5"
pytorch-cuda = "*"
torchvision = "*"
ninja = "*"

[pypi-options]
no-build-isolation = ["diff-surfel-rasterization"]

[pypi-dependencies]
# This requires CUDA_HOME to be set, so this step fails when running `pixi shell`
diff-surfel-rasterization = { git = "https://github.com/hbb1/diff-surfel-rasterization.git" }

[tasks]
# Running `pixi run surfel-install` does install the package successfully
# surfel-install = "pip install git+https://github.com/hbb1/diff-surfel-rasterization.git@7bdbd51"

Issue description

Hi. With the pixi.toml above, I cannot build the environment on my Windows machine. Specifically, building the pip package from source fails with the following error message:

❯ pixi shell
  ⠈ default:win-64       [00:00:06]
  × failed to solve the pypi requirements of 'default' 'win-64'
  ├─▶ failed to resolve pypi dependencies
  ├─▶ Failed to download and build `diff-surfel-rasterization @ git+https://github.com/hbb1/diff-surfel-rasterization.git`
  ╰─▶ Build backend failed to determine metadata through `prepare_metadata_for_build_wheel` with exit code: 1
      --- stdout:

      --- stderr:
      Traceback (most recent call last):
        File "<string>", line 14, in <module>
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\setuptools\build_meta.py", line 373, in prepare_metadata_for_build_wheel
          self.run_setup()
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\setuptools\build_meta.py", line 502, in run_setup
          super().run_setup(setup_script=setup_script)
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\setuptools\build_meta.py", line 318, in run_setup
          exec(code, locals())
        File "<string>", line 22, in <module>
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\torch\utils\cpp_extension.py", line 1076, in CUDAExtension
          library_dirs += library_paths(cuda=True)
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\torch\utils\cpp_extension.py", line 1214, in library_paths
          paths.append(_join_cuda_home(lib_dir))
        File "C:\devel\test-pixi\.pixi\envs\default\lib\site-packages\torch\utils\cpp_extension.py", line 2416, in _join_cuda_home
          raise OSError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

I am on pixi 0.29.0. If I run the pip install step as a separate command (via pixi run surfel-install), the CUDA kernels build as expected. I would appreciate any guidance on this issue. I couldn't find any other issue about a missing CUDA_HOME environment variable; the closest I could find is this conda issue (https://github.com/conda/conda/issues/7757). Since the build works via pixi run surfel-install, this looks like a bug or missing feature in pixi, but let me know if I'm wrong.

Expected behavior

The environment described by the provided pixi.toml file builds successfully.

tdejager commented 6 days ago

Actually I would've expected the env variable to be set by the CUDA activation script.

Variables set by activation should be passed on to the PyPI build. @baszalmstra do you know anything about the 'CUDA_HOME' env variable?

baszalmstra commented 6 days ago

I'm also pretty sure this should be set by the activation script. Could you check by going into the shell and echoing it?

@tdejager do we actually run the activation script and pass those variables to uv?

tdejager commented 6 days ago

We do pass env variables to uv for sure; otherwise things like C compilers wouldn't work either. And I believe we use the 'activator' for that.
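
For readers unfamiliar with the mechanism being discussed, here is a minimal Python sketch of handing activation variables to a child build process. It is not pixi's or uv's actual implementation; the helper name `run_with_activation_env` and the example path are hypothetical and only illustrate that a child process sees exactly the environment it is given.

```python
import os
import subprocess
import sys


def run_with_activation_env(activation_env, code):
    """Run a Python snippet in a child process that sees `activation_env`.

    `activation_env` stands in for the variables an activation script
    would export; it is layered on top of the parent's environment.
    """
    child_env = {**os.environ, **activation_env}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical value; the real toolkit root depends on the environment.
    seen = run_with_activation_env(
        {"CUDA_HOME": "/opt/example-cuda"},
        "import os; print(os.environ.get('CUDA_HOME'))",
    )
    print(seen)
```

If a variable is missing from the activation output in the first place, forwarding the environment faithfully will not make it appear in the build.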

jdumas commented 6 days ago

If I remove the surfel package and enter the pixi shell, running echo $env:CUDA_HOME prints nothing. I have no idea how the manual pip install manages to succeed. Do you think it's a problem with uv itself?
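
To compare what `pixi shell` and `pixi run` actually expose, a small script along these lines could be run in both. It is only a sketch. It assumes that the lookup in torch's `cpp_extension` module can draw on `CUDA_HOME`, `CUDA_PATH`, and the location of `nvcc` on `PATH`, which should be checked against the installed torch version. The function name `cuda_home_candidates` is made up for illustration.

```python
import os
import shutil


def cuda_home_candidates(env=None):
    """Return the values a CUDA_HOME lookup could plausibly draw on.

    `env` defaults to the current process environment; pass a dict to
    inspect a captured environment instead.
    """
    env = os.environ if env is None else env
    # An empty PATH means "no lookup", so a dict without PATH finds nothing.
    nvcc = shutil.which("nvcc", path=env.get("PATH", ""))
    return {
        "CUDA_HOME": env.get("CUDA_HOME"),
        "CUDA_PATH": env.get("CUDA_PATH"),
        "nvcc": nvcc,
        # Two levels up from .../bin/nvcc is the usual toolkit root.
        "root_from_nvcc": os.path.dirname(os.path.dirname(nvcc)) if nvcc else None,
    }


if __name__ == "__main__":
    for name, value in cuda_home_candidates().items():
        print(f"{name} = {value!r}")
```

If the script reports a `CUDA_PATH` or an `nvcc` location under `pixi run` but not during the PyPI solve, that difference might explain why the manual pip install succeeds even though `CUDA_HOME` itself is empty.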

EDIT: I've opened https://github.com/astral-sh/uv/issues/7299 after trying to build this dependency with uv directly.