Open cifnik opened 10 months ago
I had the same problem with an RTX 4090: I couldn't get the main environment.yml to install a version of torch that supported the 4090's architecture.
Installing ninja also helped a little; I see it is missing from your environment. I started playing around with some of the experimental branches and eventually got a stable conda environment with CUDA 11.8. I've only just started testing, but it's going well so far. One key part was rolling back flash attention. I'm unsure why, but it works when rolled back to v2.0 (commit hash 4f285b3).
name: openfold_plup118
channels:
- pytorch
- bioconda
- nvidia
- conda-forge
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_kmp_llvm
- absl-py=2.1.0=pyhd8ed1ab_0
- appdirs=1.4.4=pyh9f0ad1d_0
- aria2=1.37.0=h347180d_1
- aws-c-auth=0.7.8=h538f98c_2
- aws-c-cal=0.6.9=h5d48c4d_2
- aws-c-common=0.9.10=hd590300_0
- aws-c-compression=0.2.17=h7f92143_7
- aws-c-event-stream=0.3.2=h0bcb0bb_8
- aws-c-http=0.7.14=hd268abd_3
- aws-c-io=0.13.36=he0cd244_2
- aws-c-mqtt=0.9.10=h35285c7_2
- aws-c-s3=0.4.4=h0448019_0
- aws-c-sdkutils=0.1.13=h7f92143_0
- aws-checksums=0.1.17=h7f92143_6
- awscli=2.15.45=py39hf3d152e_0
- awscrt=0.19.19=py39hf0530f4_2
- biopython=1.79=py39hb9d737c_3
- blas=2.116=mkl
- blas-devel=3.9.0=16_linux64_mkl
- brotli-python=1.1.0=py39h3d6467e_1
- bzip2=1.0.8=hd590300_5
- c-ares=1.28.1=hd590300_0
- ca-certificates=2024.2.2=hbcca054_0
- certifi=2024.2.2=pyhd8ed1ab_0
- cffi=1.16.0=py39h7a31438_0
- charset-normalizer=3.3.2=pyhd8ed1ab_0
- click=8.1.7=unix_pyh707e725_0
- colorama=0.4.6=pyhd8ed1ab_0
- contextlib2=21.6.0=pyhd8ed1ab_0
- cryptography=40.0.2=py39h079d5ae_0
- cuda-cudart=11.8.89=0
- cuda-cupti=11.8.87=0
- cuda-libraries=11.8.0=0
- cuda-nvrtc=11.8.89=0
- cuda-nvtx=11.8.86=0
- cuda-runtime=11.8.0=0
- cudatoolkit=11.8.0=h4ba93d1_13
- distro=1.8.0=pyhd8ed1ab_0
- docker-pycreds=0.4.0=py_0
- docutils=0.19=py39hf3d152e_1
- fftw=3.3.10=nompi_hc118613_108
- filelock=3.14.0=pyhd8ed1ab_0
- fsspec=2024.3.1=pyhca7485f_0
- git=2.45.0=pl5321hef9f9f3_1
- gitdb=4.0.11=pyhd8ed1ab_0
- gitpython=3.1.43=pyhd8ed1ab_0
- gmp=6.3.0=h59595ed_1
- gmpy2=2.1.5=py39h03b5d36_0
- hhsuite=3.3.0=py39pl5321he10ea66_10
- hmmer=3.3.2=hdbdd923_4
- icu=73.2=h59595ed_0
- idna=3.7=pyhd8ed1ab_0
- ihm=1.0=py39hd1e30aa_0
- jinja2=3.1.3=pyhd8ed1ab_0
- jmespath=1.0.1=pyhd8ed1ab_0
- kalign2=2.04=h031d066_6
- keyutils=1.6.1=h166bdaf_0
- krb5=1.21.2=h659d440_0
- ld_impl_linux-64=2.40=h55db66e_0
- libabseil=20240116.2=cxx17_h59595ed_0
- libblas=3.9.0=16_linux64_mkl
- libcblas=3.9.0=16_linux64_mkl
- libcublas=11.11.3.6=0
- libcufft=10.9.0.58=0
- libcufile=1.9.1.3=0
- libcurand=10.3.5.147=0
- libcurl=8.7.1=hca28451_0
- libcusolver=11.4.1.48=0
- libcusparse=11.7.5.86=0
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=hd590300_2
- libexpat=2.6.2=h59595ed_0
- libffi=3.4.2=h7f98852_5
- libgcc=7.2.0=h69d50b8_2
- libgcc-ng=13.2.0=h77fa898_7
- libgfortran-ng=13.2.0=h69a702a_7
- libgfortran5=13.2.0=hca663fb_7
- libhwloc=2.10.0=default_h2fb2949_1000
- libiconv=1.17=hd590300_2
- liblapack=3.9.0=16_linux64_mkl
- liblapacke=3.9.0=16_linux64_mkl
- libnghttp2=1.58.0=h47da74e_1
- libnpp=11.8.0.86=0
- libnsl=2.0.1=hd590300_0
- libnvjpeg=11.9.0.86=0
- libprotobuf=4.25.3=h08a7969_0
- libsqlite=3.45.3=h2797004_0
- libssh2=1.11.0=h0841786_0
- libstdcxx-ng=13.2.0=hc0a3c3a_7
- libuuid=2.38.1=h0b41bf4_0
- libxcrypt=4.4.36=hd590300_1
- libxml2=2.12.6=h232c23b_2
- libzlib=1.2.13=hd590300_5
- lightning-utilities=0.11.2=pyhd8ed1ab_0
- llvm-openmp=15.0.7=h0cdce71_0
- markupsafe=2.1.5=py39hd1e30aa_0
- mkl=2022.1.0=h84fe81f_915
- mkl-devel=2022.1.0=ha770c72_916
- mkl-include=2022.1.0=h84fe81f_915
- ml-collections=0.1.1=pyhd8ed1ab_0
- modelcif=0.7=pyhd8ed1ab_0
- mpc=1.3.1=hfe3b2da_0
- mpfr=4.2.1=h9458935_1
- mpmath=1.3.0=pyhd8ed1ab_0
- msgpack-python=1.0.7=py39h7633fee_0
- ncurses=6.4.20240210=h59595ed_0
- networkx=3.2.1=pyhd8ed1ab_0
- numpy=1.26.4=py39h474f0d3_0
- ocl-icd=2.3.2=hd590300_1
- ocl-icd-system=1.0.0=1
- openmm=7.7.0=py39h15fbce5_1
- openssl=3.3.0=hd590300_0
- packaging=24.0=pyhd8ed1ab_0
- pandas=2.2.2=py39hddac248_0
- pathtools=0.1.2=py_1
- pcre2=10.43=hcad00b1_0
- pdbfixer=1.8.1=pyh6c4a22f_0
- perl=5.32.1=7_hd590300_perl5
- pip=24.0=pyhd8ed1ab_0
- pretty_errors=1.2.25=pyhd8ed1ab_0
- prompt-toolkit=3.0.38=pyha770c72_0
- prompt_toolkit=3.0.38=hd8ed1ab_0
- protobuf=4.25.3=py39h1be52a0_0
- psutil=5.9.8=py39hd1e30aa_0
- pycparser=2.22=pyhd8ed1ab_0
- pyopenssl=23.1.1=pyhd8ed1ab_0
- pysocks=1.7.1=pyha2e5f31_6
- python=3.9.19=h0755675_0_cpython
- python-dateutil=2.8.2=pyhd8ed1ab_0
- python-tzdata=2024.1=pyhd8ed1ab_0
- python_abi=3.9=4_cp39
- pytorch=2.1.2=py3.9_cuda11.8_cudnn8.7.0_0
- pytorch-cuda=11.8=h7e8668a_5
- pytorch-lightning=2.2.2=pyhd8ed1ab_0
- pytorch-mutex=1.0=cuda
- pytz=2024.1=pyhd8ed1ab_0
- pyyaml=5.4.1=py39hb9d737c_4
- readline=8.2=h8228510_1
- requests=2.31.0=pyhd8ed1ab_0
- ruamel.yaml=0.17.21=py39h72bdee0_3
- ruamel.yaml.clib=0.2.7=py39hd1e30aa_2
- s2n=1.4.0=h06160fa_0
- scipy=1.13.0=py39haf93ffa_1
- sentry-sdk=2.1.1=pyhd8ed1ab_0
- setproctitle=1.3.3=py39hd1e30aa_0
- setuptools=59.5.0=py39hf3d152e_0
- six=1.16.0=pyh6c4a22f_0
- smmap=5.0.0=pyhd8ed1ab_0
- sympy=1.12=pypyh9d50eac_103
- tbb=2021.12.0=h00ab1b0_0
- tk=8.6.13=noxft_h4845f30_101
- torchmetrics=1.4.0=pyhd8ed1ab_0
- torchtriton=2.1.0=py39
- tqdm=4.62.2=pyhd8ed1ab_0
- typing-extensions=4.11.0=hd8ed1ab_0
- typing_extensions=4.11.0=pyha770c72_0
- tzdata=2024a=h0c530f3_0
- urllib3=1.26.18=pyhd8ed1ab_0
- wandb=0.16.5=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_0
- wheel=0.43.0=pyhd8ed1ab_1
- xz=5.2.6=h166bdaf_0
- yaml=0.2.5=h7f98852_2
- zstd=1.5.6=ha6fb4c9_0
- pip:
    - annotated-types==0.6.0
    - deepspeed==0.12.4
    - dllogger==1.0.0
    - dm-tree==0.1.6
    - einops==0.8.0
    - flash-attn==2.0.0.post1
    - hjson==3.1.0
    - ninja==1.11.1.1
    - openfold==2.0.0
    - py-cpuinfo==9.0.0
    - pydantic==2.7.1
    - pydantic-core==2.18.2
    - pynvml==11.5.0
variables:
  CUTLASS_PATH: /mnt/nvme1n1p1/openfold_install/pl_upgrades/cutlass
  KMP_AFFINITY: none
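Since the flash attention rollback was the key part, it may be worth confirming that the active environment really has the pinned versions. This is a minimal sketch, not part of OpenFold: `PINS` copies a few versions from the listing above, and `check_pins` is a hypothetical helper name. It assumes the packages expose the distribution names `torch`, `flash-attn`, `deepspeed` and `ninja` to `importlib.metadata`.

```python
from importlib import metadata

# Versions copied from the environment listing above.
# The distribution names are an assumption about how the packages register.
PINS = {
    "torch": "2.1.2",
    "flash-attn": "2.0.0.post1",
    "deepspeed": "0.12.4",
    "ninja": "1.11.1.1",
}


def check_pins(pins, get_version=metadata.version):
    """Return {name: (wanted, found)} for every pin that does not match.

    found is None when the package is not installed at all.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = (wanted, None)
            continue
        # Ignore a local build suffix such as "+cu118" when comparing.
        if found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    mismatches = check_pins(PINS)
    if not mismatches:
        print("all pinned packages match")
    for name, (wanted, found) in mismatches.items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Run it inside the activated conda environment; an empty result only says the versions match, not that the compiled extensions work on the GPU.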
I am trying to use OpenFold on a machine with an RTX 4090 and CUDA driver 12.1. I installed the package using the environment file and I get this problem when attn_cuda is being installed
(copying the last part; it looks like it has to do with the architecture of my GPU? I could install on older ones with CUDA driver 10.2)
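One way to test the architecture theory is to compare the GPU's compute capability with the architectures the installed torch build was compiled for. The sketch below is an assumption-labelled illustration, not OpenFold code: `build_supports_device` and `report` are hypothetical helpers, while `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()` are existing PyTorch calls. The RTX 4090 reports capability 8.9 (`sm_89`). The check applies the usual CUDA rule that an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version, and that `compute_XY` PTX can be JIT-compiled for any equal or newer device. It only covers torch's own kernels; extensions such as attn_cuda are compiled separately.

```python
import re


def build_supports_device(arch_list, capability):
    """Check a compiled arch list against a (major, minor) capability."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"(sm|compute)_(\d+)", arch)
        if match is None:
            # Skip entries this sketch does not understand (e.g. suffixed archs).
            continue
        kind, digits = match.group(1), match.group(2)
        major, minor = int(digits[:-1]), int(digits[-1])
        # Binary code: same major version, minor not newer than the device.
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        # PTX: can be JIT-compiled for any equal or newer device.
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


def report():
    """Print what the installed torch build says about GPU 0, if possible."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("torch cannot see a CUDA device")
        return
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("compiled architectures:", archs)
    print("supported:", build_supports_device(archs, capability))


if __name__ == "__main__":
    report()
```

If this prints `supported: False`, the torch build itself is too old for the card; if it prints `True` and the extension build still fails, the toolkit used to compile the extension is the more likely culprit.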
Installing flash-attn on its own works, but then running a small test:
python run_pretrained_openfold.py fasta_dir data/pdb_mmcif/mmcif_files/ --uniref90_database_path data/uniref90/uniref90.fasta --mgnify_database_path data/mgnify/mgy_clusters_2018_12.fa --pdb70_database_path data/pdb70/pdb70 --uniclust30_database_path data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 --output_dir ./ --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --model_device "cuda:0" --jackhmmer_binary_path lib/conda/envs/openfold_venv/bin/jackhmmer --hhblits_binary_path lib/conda/envs/openfold_venv/bin/hhblits --hhsearch_binary_path lib/conda/envs/openfold_venv/bin/hhsearch --kalign_binary_path lib/conda/envs/openfold_venv/bin/kalign --config_preset "model_1_ptm" --openfold_checkpoint_path openfold/resources/openfold_params/finetuning_ptm_2.pt
I get the following error: