aqlaboratory / openfold

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
Apache License 2.0

installation issue with cuda 12 #494

Open blakemertz opened 1 week ago

blakemertz commented 1 week ago

I have tried several ways to install openfold on my local machine, without success so far. I could use some help, as I need openfold as a dependency for a couple of other codes (in particular DiffDock-L). Here are my GPU, driver, and CUDA versions:

nvidia-smi 
Mon Oct 14 16:54:16 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06             Driver Version: 535.183.06   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 ...    Off | 00000000:01:00.0  On |                  N/A |
| N/A   41C    P8              15W /  80W |     59MiB /  6144MiB |     14%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1382      G   /usr/lib/xorg/Xorg                           55MiB |
+---------------------------------------------------------------------------------------+

The version 12 toolchain (gcc/g++/gfortran) on my OS is 12.4. I believe that 12.2 is the highest version supported by CUDA 12.1/12.2, but 12.4 is what my Debian testing repos provide.
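For readers who want to make the version comparison above explicit, here is a minimal sketch. The bound `ASSUMED_MAX_GCC` is only the belief stated in this post and has not been checked against NVIDIA's documentation. The function names and the sample version strings are hypothetical, chosen for illustration.

```python
import re

# Assumption: this bound is the belief stated in the post above,
# not a value checked against NVIDIA's documentation.
ASSUMED_MAX_GCC = (12, 2)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def exceeds_bound(version_text, bound=ASSUMED_MAX_GCC):
    """True if the parsed version is newer than bound, at the bound's precision."""
    return parse_version(version_text)[: len(bound)] > bound


# Illustrative strings shaped like the first line of `gcc --version` output.
print(exceeds_bound("gcc (Debian 12.4.0-1) 12.4.0"))
print(exceeds_bound("gcc (GCC) 12.2.0"))
```

The comparison is truncated to the precision of the bound, so a patch-level difference such as 12.2.1 versus 12.2 is not flagged.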

These are the packages in my openfold environment, built from the pl_upgrades branch so that I can use PyTorch 2 and CUDA 12:

conda list
# packages in environment at /media/Data/binaries/miniconda3/envs/openfold:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
absl-py                   2.1.0              pyhd8ed1ab_0    conda-forge
annotated-types           0.7.0                    pypi_0    pypi
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
aria2                     1.37.0               hbc8128a_2    conda-forge
aws-c-auth                0.7.26               hc36b679_2    conda-forge
aws-c-cal                 0.7.4                h2abdd08_0    conda-forge
aws-c-common              0.9.27               h4bc722e_0    conda-forge
aws-c-compression         0.2.19               haa50ccc_0    conda-forge
aws-c-event-stream        0.4.3                h570d160_0    conda-forge
aws-c-http                0.8.8                h9b61739_1    conda-forge
aws-c-io                  0.14.18              h49c7fd3_7    conda-forge
aws-c-mqtt                0.10.4              h5c8269d_18    conda-forge
aws-c-s3                  0.6.4               h77088c0_11    conda-forge
aws-c-sdkutils            0.1.19               h038f3f9_2    conda-forge
aws-checksums             0.1.18              h038f3f9_10    conda-forge
awscli                    2.18.3          py310hff52083_0    conda-forge
awscrt                    0.21.2          py310h95a9d59_15    conda-forge
biopython                 1.84            py310hc51659f_0    conda-forge
blas                      2.116                       mkl    conda-forge
blas-devel                3.9.0            16_linux64_mkl    conda-forge
brotli-python             1.1.0           py310hc6cd4ac_1    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.33.1               heb4867d_0    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
certifi                   2024.8.30          pyhd8ed1ab_0    conda-forge
cffi                      1.17.0          py310h2fdcea3_0    conda-forge
charset-normalizer        3.4.0              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
contextlib2               21.6.0             pyhd8ed1ab_0    conda-forge
cryptography              40.0.2          py310h34c0648_0    conda-forge
cuda-cudart               12.1.105                      0    nvidia
cuda-cupti                12.1.105                      0    nvidia
cuda-libraries            12.1.0                        0    nvidia
cuda-nvrtc                12.1.105                      0    nvidia
cuda-nvtx                 12.1.105                      0    nvidia
cuda-opencl               12.4.127                      0    nvidia
cuda-runtime              12.1.0                        0    nvidia
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
deepspeed                 0.12.4                   pypi_0    pypi
distro                    1.8.0              pyhd8ed1ab_0    conda-forge
dllogger                  1.0.0                    pypi_0    pypi
dm-tree                   0.1.6                    pypi_0    pypi
docker-pycreds            0.4.0                      py_0    conda-forge
docutils                  0.19            py310hff52083_1    conda-forge
einops                    0.8.0                    pypi_0    pypi
fftw                      3.3.10          nompi_hf1063bd_110    conda-forge
filelock                  3.16.1             pyhd8ed1ab_0    conda-forge
flash-attn                2.6.3                    pypi_0    pypi
fsspec                    2024.9.0           pyhff2d567_0    conda-forge
git                       2.46.0          pl5321hb5640b7_0    conda-forge
gitdb                     4.0.11             pyhd8ed1ab_0    conda-forge
gitpython                 3.1.43             pyhd8ed1ab_0    conda-forge
gmp                       6.3.0                hac33072_2    conda-forge
gmpy2                     2.1.5           py310hc7909c9_1    conda-forge
hhsuite                   3.3.0           py310pl5321hc31ed2c_12    bioconda
hjson                     3.1.0                    pypi_0    pypi
hmmer                     3.4                  hdbdd923_2    bioconda
icu                       75.1                 he02047a_0    conda-forge
idna                      3.10               pyhd8ed1ab_0    conda-forge
ihm                       1.3             py310h5b4e0ec_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
jmespath                  1.0.1              pyhd8ed1ab_0    conda-forge
kalign2                   2.04                 h031d066_7    bioconda
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
ld_impl_linux-64          2.43                 h712a8e2_1    conda-forge
libabseil                 20240116.2      cxx17_he02047a_1    conda-forge
libblas                   3.9.0            16_linux64_mkl    conda-forge
libcblas                  3.9.0            16_linux64_mkl    conda-forge
libcublas                 12.1.0.26                     0    nvidia
libcufft                  11.0.2.4                      0    nvidia
libcufile                 1.9.1.3                       0    nvidia
libcurand                 10.3.5.147                    0    nvidia
libcurl                   8.9.1                hdb1bdb2_0    conda-forge
libcusolver               11.4.4.55                     0    nvidia
libcusparse               12.0.2.55                     0    nvidia
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    7.2.0                h69d50b8_2    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libhwloc                  2.11.1          default_hecaa2ac_1000    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
liblapack                 3.9.0            16_linux64_mkl    conda-forge
liblapacke                3.9.0            16_linux64_mkl    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnpp                    12.0.2.50                     0    nvidia
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.1.105                      0    nvidia
libnvjpeg                 12.1.1.14                     0    nvidia
libprotobuf               4.25.3               h08a7969_0    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
lightning-utilities       0.11.7             pyhd8ed1ab_0    conda-forge
llvm-openmp               15.0.7               h0cdce71_0    conda-forge
markupsafe                2.1.5           py310h2372a71_0    conda-forge
mkl                       2022.1.0           h84fe81f_915    conda-forge
mkl-devel                 2022.1.0           ha770c72_916    conda-forge
mkl-include               2022.1.0           h84fe81f_915    conda-forge
ml-collections            0.1.1              pyhd8ed1ab_0    conda-forge
modelcif                  0.7                pyhd8ed1ab_0    conda-forge
mpc                       1.3.1                h24ddda3_0    conda-forge
mpfr                      4.2.1                h38ae2d0_2    conda-forge
mpmath                    1.3.0              pyhd8ed1ab_0    conda-forge
msgpack-python            1.0.8           py310h25c7140_0    conda-forge
ncurses                   6.5                  he02047a_1    conda-forge
networkx                  3.3                pyhd8ed1ab_1    conda-forge
ninja                     1.11.1.1                 pypi_0    pypi
numpy                     1.26.0          py310hb13e2d6_0    conda-forge
ocl-icd                   2.3.2                hd590300_1    conda-forge
ocl-icd-system            1.0.0                         1    conda-forge
openmm                    7.7.0           py310hccf1d78_1    conda-forge
openssl                   3.3.1                hb9d3cd8_3    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2           py310hf9f9076_1    conda-forge
pcre2                     10.44                hba22ea6_2    conda-forge
pdbfixer                  1.8.1              pyh6c4a22f_0    conda-forge
perl                      5.32.1          7_hd590300_perl5    conda-forge
pip                       24.2               pyh8b19718_1    conda-forge
platformdirs              4.3.6              pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.38             pyha770c72_0    conda-forge
prompt_toolkit            3.0.38               hd8ed1ab_0    conda-forge
protobuf                  4.25.3          py310ha8c1f0e_0    conda-forge
psutil                    6.0.0           py310hc51659f_0    conda-forge
py-cpuinfo                9.0.0                    pypi_0    pypi
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pydantic                  2.9.2                    pypi_0    pypi
pydantic-core             2.23.4                   pypi_0    pypi
pynvml                    11.5.3                   pypi_0    pypi
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.14         hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.2             pyhd8ed1ab_0    conda-forge
python_abi                3.10                    5_cp310    conda-forge
pytorch                   2.1.2           py3.10_cuda12.1_cudnn8.9.2_0    pytorch
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
pytorch-lightning         2.4.0              pyhd8ed1ab_0    conda-forge
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2024.2             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1           py310h5764c6d_4    conda-forge
readline                  8.2                  h8228510_1    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
ruamel.yaml               0.17.21         py310h1fa729e_3    conda-forge
ruamel.yaml.clib          0.2.8           py310h2372a71_0    conda-forge
s2n                       1.5.1                h3400bea_0    conda-forge
scipy                     1.14.1          py310ha3fb0e1_0    conda-forge
sentry-sdk                2.16.0             pyhd8ed1ab_0    conda-forge
setproctitle              1.3.3           py310h2372a71_0    conda-forge
setuptools                59.5.0          py310hff52083_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smmap                     5.0.0              pyhd8ed1ab_0    conda-forge
sympy                     1.13.3          pypyh2585a3b_103    conda-forge
tbb                       2021.12.0            h434a139_3    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
torchmetrics              1.4.2              pyhd8ed1ab_0    conda-forge
torchtriton               2.1.0                     py310    pytorch
tqdm                      4.62.2             pyhd8ed1ab_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
tzdata                    2024b                hc8b5060_0    conda-forge
urllib3                   1.26.19            pyhd8ed1ab_0    conda-forge
wandb                     0.16.6             pyhd8ed1ab_1    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
wheel                     0.44.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge
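The listing above contains CUDA packages from more than one release line (for example cudatoolkit 11.8.0 next to the cuda-* 12.1 packages). Whether that mix contributes to the build failure is not established in this thread. As a sketch for checking one's own environment, the helper below groups CUDA-related packages by major version from text in the `conda list` format. The function name `cuda_majors` and the name-matching rule (package name starts with "cuda" or is "pytorch-cuda") are assumptions made for illustration.

```python
from collections import defaultdict


def cuda_majors(conda_list_text):
    """Group CUDA-related package names by the major version they report."""
    groups = defaultdict(list)
    for line in conda_list_text.splitlines():
        # Skip blank lines and the '#' header lines printed by conda.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, version = fields[0], fields[1]
        # Assumed matching rule: names beginning with "cuda", plus pytorch-cuda.
        if name.startswith("cuda") or name == "pytorch-cuda":
            groups[version.split(".")[0]].append(name)
    return dict(groups)


# A few rows copied from the listing above.
sample = """\
cuda-cudart               12.1.105                      0    nvidia
cuda-runtime              12.1.0                        0    nvidia
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
"""

for major, names in sorted(cuda_majors(sample).items()):
    print(major, names)
```

More than one key in the result means the environment carries CUDA packages from different major releases.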

During installation of 3rd-party dependencies, I get the following output, indicating that the dependencies did not install (setup.py install is part of this process and failed to run):

./scripts/install_third_party_dependencies.sh 
--2024-10-14 16:41:09--  https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt
Resolving git.scicore.unibas.ch (git.scicore.unibas.ch)... 131.152.229.50
Connecting to git.scicore.unibas.ch (git.scicore.unibas.ch)|131.152.229.50|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9119 (8.9K) [text/plain]
Saving to: ‘openfold/resources/stereo_chemical_props.txt’

stereo_chemical_props.txt                     100%[=================================================================================================>]   8.91K  --.-KB/s    in 0.001s  

Last-modified header missing -- time-stamps turned off.
2024-10-14 16:41:10 (7.15 MB/s) - ‘openfold/resources/stereo_chemical_props.txt’ saved [9119/9119]

running install
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/easy_install.py:156: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running bdist_egg
running egg_info
writing openfold.egg-info/PKG-INFO
writing dependency_links to openfold.egg-info/dependency_links.txt
writing top-level names to openfold.egg-info/top_level.txt
reading manifest file 'openfold.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'openfold.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying openfold/resources/stereo_chemical_props.txt -> build/lib.linux-x86_64-3.10/openfold/resources
running build_ext
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
  warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'attn_core_inplace_cuda' extension
Emitting ninja build file /media/Data/binaries/github/openfold-pl_upgrades/build/temp.linux-x86_64-3.10/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] /usr/bin/nvcc  -I/media/Data/binaries/github/openfold-pl_upgrades/openfold/utils/kernel/csrc/ -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/TH -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/THC -I/media/Data/binaries/miniconda3/envs/openfold/include/python3.10 -c -c /media/Data/binaries/github/openfold-pl_upgrades/openfold/utils/kernel/csrc/softmax_cuda_kernel.cu -o /media/Data/binaries/github/openfold-pl_upgrades/build/temp.linux-x86_64-3.10/openfold/utils/kernel/csrc/softmax_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -std=c++17 -maxrregcount=50 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=attn_core_inplace_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /media/Data/binaries/github/openfold-pl_upgrades/build/temp.linux-x86_64-3.10/openfold/utils/kernel/csrc/softmax_cuda_kernel.o 
/usr/bin/nvcc  -I/media/Data/binaries/github/openfold-pl_upgrades/openfold/utils/kernel/csrc/ -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/TH -I/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/THC -I/media/Data/binaries/miniconda3/envs/openfold/include/python3.10 -c -c /media/Data/binaries/github/openfold-pl_upgrades/openfold/utils/kernel/csrc/softmax_cuda_kernel.cu -o /media/Data/binaries/github/openfold-pl_upgrades/build/temp.linux-x86_64-3.10/openfold/utils/kernel/csrc/softmax_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -std=c++17 -maxrregcount=50 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=attn_core_inplace_cuda -D_GLIBCXX_USE_CXX11_ABI=0
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/pybind11/detail/../cast.h: In function ‘typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&)’:
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected template-name before ‘<’ token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                        ^
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/pybind11/detail/../cast.h:45:120: error: expected identifier before ‘<’ token
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before ‘>’ token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                           ^
/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before ‘)’ token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                              ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/Data/binaries/github/openfold-pl_upgrades/setup.py", line 113, in <module>
    setup(
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/install.py", line 74, in run
    self.do_egg_install()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/install.py", line 116, in do_egg_install
    self.run_command('bdist_egg')
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 164, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
    self.run_command(cmdname)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
    build_ext.build_extensions(self)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/distutils/command/build_ext.py", line 529, in build_extension
    objects = self.compiler.compile(sources,
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/media/Data/binaries/miniconda3/envs/openfold/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
Download CUTLASS, required for Deepspeed Evoformer attention kernel
Cloning into 'cutlass'...
remote: Enumerating objects: 6103, done.
remote: Counting objects: 100% (6103/6103), done.
remote: Compressing objects: 100% (1797/1797), done.
remote: Total 6103 (delta 3528), reused 4982 (delta 3018), pack-reused 0 (from 0)
Receiving objects: 100% (6103/6103), 27.71 MiB | 4.72 MiB/s, done.
Resolving deltas: 100% (3528/3528), done.
To make your changes take effect please reactivate your environment
To make your changes take effect please reactivate your environment
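In the log above, the ninja step invokes /usr/bin/nvcc, while the include paths point into the conda environment. Whether that matters for this failure is not established in the thread. The sketch below shows one way to pull the nvcc path out of a build log and check whether it lies under a given environment prefix. The helper names `nvcc_paths` and `is_inside` are hypothetical, and the sample line is abbreviated from the log.

```python
import os


def nvcc_paths(log_text):
    """Collect each whitespace-separated token whose file name is exactly nvcc."""
    found = []
    for token in log_text.split():
        if os.path.basename(token) == "nvcc" and token not in found:
            found.append(token)
    return found


def is_inside(path, prefix):
    """True if path lies under the directory prefix (a purely lexical check)."""
    prefix = os.path.normpath(prefix)
    try:
        return os.path.commonpath([os.path.normpath(path), prefix]) == prefix
    except ValueError:
        # Raised when absolute and relative paths are mixed, e.g. a bare "nvcc".
        return False


# Abbreviated from the ninja line in the log above.
log_line = (
    "[1/1] /usr/bin/nvcc "
    "-I/media/Data/binaries/miniconda3/envs/openfold/include/python3.10 "
    "-c softmax_cuda_kernel.cu"
)
env_prefix = "/media/Data/binaries/miniconda3/envs/openfold"

for path in nvcc_paths(log_line):
    print(path, "inside env:", is_inside(path, env_prefix))
```

The check is lexical only: it does not resolve symlinks, so a link inside the environment that points to a system compiler would still be reported as inside.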

This is where I am stuck: I don't really know what to do with the "Error compiling objects for extension" message. I have already looked at #403, #462, and #477 and have done my best to implement their suggestions, but I still do not have a fully working environment.

vaclavhanzl commented 10 hours ago

@blakemertz Are you sure you are using your OS's gcc? Could you please activate your environment and try which gcc? And gcc -v? And should the version happen to be 13.3, could you please try mamba install gcc=12.4? This fixed it for me.
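The checks suggested above can also be run from Python inside the activated environment, so that the result reflects the same PATH the build sees. This is a minimal sketch using only the standard library. The helper name `describe_tool` is hypothetical, and it calls `--version` rather than `-v` because `gcc -v` writes its report to stderr along with configuration details.

```python
import shutil
import subprocess


def describe_tool(name):
    """Return (resolved path, version text) for a tool on PATH, or None if absent."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Most compilers print the version on stdout; fall back to stderr if empty.
    return path, (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for tool in ("gcc", "g++", "nvcc"):
        print(tool, "->", describe_tool(tool))
```

Running it once with the environment activated and once without makes it easy to see whether activation changes which compiler is picked up.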

blakemertz commented 6 hours ago

@vaclavhanzl thanks for responding. My OS gcc is v12: I deleted the existing symlink to gcc 14 and recreated it to point to gcc 12, then checked with gcc -v both on my OS and in my openfold environment. I will double-check and also try installing gcc=12.4 with mamba, and let you know if that fixes the issue.