Because environment.yml does not pin a NumPy version, creating the environment pulls in NumPy 2.1.0 as of today's date. This causes an error while installing the specified version of deepspeed (0.12.4), whose setup.py imports torch.
Pinning NumPy to version 1.26 appears to get past the installation error, but I have not checked for other regressions.
```
Using cached deepspeed-0.12.4.tar.gz (1.2 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [34 lines of output]
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/setup.py", line 31, in <module>
import torch
File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/__init__.py", line 1382, in <module>
from .functional import * # noqa: F403
File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/functional.py", line 7, in <module>
import torch.nn.functional as F
File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
from .modules import * # noqa: F403
File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
from .transformer import TransformerEncoder, TransformerDecoder, \
File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /opt/conda/conda-bld/pytorch_1702400410390/work/torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/setup.py", line 100, in <module>
cuda_major_ver, cuda_minor_ver = installed_cuda_version()
File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/op_builder/builder.py", line 50, in installed_cuda_version
raise MissingCUDAException("CUDA_HOME does not exist, unable to compile CUDA op(s)")
op_builder.builder.MissingCUDAException: CUDA_HOME does not exist, unable to compile CUDA op(s)
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
critical libmamba pip failed to install packages
```
https://github.com/aqlaboratory/openfold/blob/3bec3e9b2d1e8bdb83887899102eff7d42dc2ba9/environment.yml#L16
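
For anyone who wants to confirm which NumPy major version ended up in their environment before retrying the install, here is a minimal sketch. The helper names (`numpy_major`, `is_numpy_1x`, `check_installed_numpy`) are my own and not part of openfold or deepspeed; the `numpy<2` suggestion is taken from the error message in the log, not from a tested fix.

```python
from importlib.metadata import PackageNotFoundError, version


def numpy_major(version_string: str) -> int:
    """Return the major component of a version string such as '1.26.4'."""
    return int(version_string.split(".")[0])


def is_numpy_1x(version_string: str) -> bool:
    """True when the version string belongs to the NumPy 1.x series."""
    return numpy_major(version_string) < 2


def check_installed_numpy() -> str:
    """Describe the NumPy version found in the current environment."""
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        return "numpy is not installed in this environment"
    if is_numpy_1x(installed):
        return f"numpy {installed} is a 1.x release"
    # Hypothetical advice, mirroring the hint printed in the install log.
    return f"numpy {installed} is 2.x or newer; consider pinning numpy<2"


if __name__ == "__main__":
    print(check_installed_numpy())
```

Running this inside the activated environment prints a single line reporting whether the resolved NumPy is 1.x or 2.x. It only reads package metadata, so it does not import NumPy or torch.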