aqlaboratory / openfold

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
Apache License 2.0

multimer installation weird behaviour #399

Open: eguarnera opened this issue 10 months ago

eguarnera commented 10 months ago

Thanks for the amazing work you put together. I'm trying to run the OpenFold multimer branch with the following command:

    python3 run_pretrained_openfold.py \
        ./FASTA \
        data/pdb_mmcif/mmcif_files/ \
        --uniref90_database_path data/uniref90/uniref90.fasta \
        --mgnify_database_path data/mgnify/mgy_clusters_2022_05.fa \
        --pdb_seqres_database_path data/pdb_seqres/pdb_seqres.txt \
        --uniref30_database_path data/uniref30/UniRef30_2021_03 \
        --uniprot_database_path data/uniprot/uniprot.fasta \
        --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
        --jackhmmer_binary_path /SW/python/miniconda3/x86_64/envs/openfold_env/bin/jackhmmer \
        --hhblits_binary_path /SW/python/miniconda3/x86_64/envs/openfold_env/bin/hhblits \
        --hmmsearch_binary_path /SW/python/miniconda3/x86_64/envs/openfold_env/bin/hmmsearch \
        --hmmbuild_binary_path /SW/python/miniconda3/x86_64/envs/openfold_env/bin/hmmbuild \
        --kalign_binary_path /SW/python/miniconda3/x86_64/envs/openfold_env/bin/kalign \
        --config_preset "model_1_multimer_v3" \
        --model_device "cuda:0" \
        --output_dir ./

I'm observing strange behaviour. The input FASTA file contains three chains, and the alignments are produced but saved into ./alignments/alignments/. Later on I get an error because the alignments cannot be found in ./alignments/.

Any idea?

    [2024-01-24 10:42:29,197] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    INFO:/cdata/TECHNO/OPENFOLD/multimer/openfold/openfold/utils/script_utils.py:Successfully loaded JAX parameters at openfold/resources/params/params_model_1_multimer_v3.npz...
    INFO:/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py:Generating alignments for R1.1_HC...
    INFO:/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py:Generating alignments for R1.1_LC...
    INFO:/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py:Generating alignments for CDXXX...
    Traceback (most recent call last):
      File "/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py", line 473, in <module>
        main(args)
      File "/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py", line 282, in main
        feature_dict = generate_feature_dict(
      File "/cdata/TECHNO/OPENFOLD/multimer/openfold/run_pretrained_openfold.py", line 154, in generate_feature_dict
        feature_dict = data_processor.process_fasta(
      File "/cdata/TECHNO/OPENFOLD/multimer/openfold/openfold/data/data_pipeline.py", line 1250, in process_fasta
        chain_features = self._process_single_chain(
      File "/cdata/TECHNO/OPENFOLD/multimer/openfold/openfold/data/data_pipeline.py", line 1166, in _process_single_chain
        raise ValueError(f"Alignments for {chain_id} not found...")
    ValueError: Alignments for R1.1_HC not found...

jnwei commented 9 months ago

Hi, thanks for your interest in our work and for raising this issue.

This error arises when a relative path (such as ./) is given for --output_dir rather than an absolute path. We'll have a patch to fix this issue in the next update.
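Until that patch lands, one possible stopgap that follows from this diagnosis is to resolve the output directory to an absolute path before passing it to the script. The snippet below is only a minimal sketch of that idea: `absolute_output_dir` is a hypothetical helper name, not part of OpenFold, and it has not been checked against the multimer branch.

```python
from pathlib import Path


def absolute_output_dir(output_dir: str) -> str:
    """Return output_dir as an absolute path.

    Hypothetical helper (not part of OpenFold): expands a leading "~" and
    resolves relative paths such as "./" against the current working directory.
    """
    return str(Path(output_dir).expanduser().resolve())
```

The returned string could then be supplied as the value of --output_dir. In a shell, passing "$(pwd)" instead of ./ should have the same effect.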

In the meantime, you can reuse the alignments you have already computed by adding the --use_precomputed_alignments flag to the command and pointing it at the directory that contains them.
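Since the report above says the alignments ended up in ./alignments/alignments/ rather than ./alignments/, it may help to confirm which directory actually holds them before setting the flag. The following is a rough sketch, not OpenFold code: `find_alignment_dir` is a hypothetical helper, and it assumes each chain's alignments sit in a subdirectory named after its FASTA tag (for example R1.1_HC), which should be checked against your own output.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_alignment_dir(output_dir: str, chain_tags: Iterable[str]) -> Optional[Path]:
    """Locate the directory that holds one subdirectory per chain tag.

    Hypothetical helper (not part of OpenFold). Assumes a layout of
    <alignment_dir>/<chain_tag>/ and checks the expected location first,
    then the doubly nested one described in this issue.
    """
    tags = list(chain_tags)
    base = Path(output_dir).resolve()
    candidates = [base / "alignments", base / "alignments" / "alignments"]
    for candidate in candidates:
        # Accept a candidate only if every chain has its own subdirectory.
        if tags and all((candidate / tag).is_dir() for tag in tags):
            return candidate
    return None
```

For the run in this issue, calling `find_alignment_dir("./", ["R1.1_HC", "R1.1_LC", "CDXXX"])` would be expected to return the nested directory if the files are where the report describes. That path can then be given to --use_precomputed_alignments.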