hpcaitech / FastFold

Optimizing AlphaFold Training and Inference on GPU Clusters
Apache License 2.0

Error when model parameters from AlphaFold v2.3.0 (multimer_v3.npz) are used #126

Closed kashyapchhatbar closed 1 year ago

kashyapchhatbar commented 1 year ago

The most recent AlphaFold release, v2.3.0, includes updated model parameters.

Running inference.py with these updated parameters (v3) throws the following error. The same command succeeds with parameters from previous versions.

Multimer command

python ~/FastFold/inference.py multimer_query.fasta \
        ~/alphafold-2.3.0_data/pdb_mmcif/mmcif_files/ \
        --use_precomputed_alignments ./alignments \
        --output_dir ./multimer_query_fastfold_v3 \
        --gpus 1 --model_preset multimer \
        --uniref90_database_path ~/alphafold-2.3.0_data/uniref90/uniref90.fasta \
        --mgnify_database_path ~/alphafold-2.3.0_data/mgnify/mgy_clusters_2022_05.fa \
        --pdb70_database_path ~/alphafold-2.3.0_data/pdb70/pdb70 \
        --uniclust30_database_path ~/alphafold-2.3.0_data/uniref30/UniRef30_2021_03 \
        --bfd_database_path ~/alphafold-2.3.0_data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
        --uniprot_database_path ~/alphafold-2.3.0_data/uniprot/uniprot.fasta \
        --pdb_seqres_database_path ~/alphafold-2.3.0_data/pdb_seqres/pdb_seqres.txt  \
        --param_path ~/alphafold-2.3.0_data/params/params_model_1_multimer_v3.npz \
        --model_name model_1_multimer_v3 \
        --jackhmmer_binary_path `which jackhmmer` \
        --hhblits_binary_path `which hhblits` \
        --hhsearch_binary_path `which hhsearch` \
        --kalign_binary_path `which kalign` \
        --chunk_size 8 --inplace

The error is pasted below.

[12/22/22 13:28:14] INFO     colossalai - colossalai - INFO: ~/conda/envs/fastfold/lib/python3.8/site-packages/colossalai/context/parallel_context.py:557
                             set_seed
                    INFO     colossalai - colossalai - INFO: initialized seed on rank 0, numpy: 1024, python random: 1024, ParallelMode.DATA: 1024, ParallelMode.TENSOR: 1024,the default
                             parallel seed is ParallelMode.DATA.
                    INFO     colossalai - colossalai - INFO: ~/conda/envs/fastfold/lib/python3.8/site-packages/colossalai/initialize.py:117 launch
                    INFO     colossalai - colossalai - INFO: Distributed environment is initialized, data parallel size: 1, pipeline parallel size: 1, tensor parallel size: 1
Traceback (most recent call last):
  File "~/FastFold/inference.py", line 513, in <module>
    main(args)
  File "~/FastFold/inference.py", line 148, in main
    inference_multimer_model(args)
  File "~/FastFold/inference.py", line 276, in inference_multimer_model
    torch.multiprocessing.spawn(inference_model, nprocs=args.gpus, args=(args.gpus, result_q, batch, args))
  File "~/conda/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "~/conda/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "~/conda/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "~/conda/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "~/FastFold/inference.py", line 123, in inference_model
    import_jax_weights_(model, args.param_path, version=args.model_name)
  File "~/FastFold/fastfold/utils/import_weights.py", line 580, in import_jax_weights_
    assert len(incorrect) == 0
AssertionError
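
One way to narrow down an assertion like this is to compare the parameter names and shapes stored in the older and newer `.npz` files. The sketch below is not part of FastFold: `summarize_params` and `diff_params` are hypothetical helpers, and the only assumption is that the parameter files are ordinary NumPy archives readable with `numpy.load`.

```python
import numpy as np


def summarize_params(npz_path):
    """Return {parameter name: array shape} for a NumPy .npz archive."""
    with np.load(npz_path) as archive:
        return {name: archive[name].shape for name in archive.files}


def diff_params(old_path, new_path):
    """Report names that were added, removed, or changed shape between two archives."""
    old = summarize_params(old_path)
    new = summarize_params(new_path)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    reshaped = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return added, removed, reshaped
```

Calling `diff_params` with the path to a parameter file that loads correctly and the path to the v3 file lists the entries that differ, which is a reasonable place to start looking for what the weight importer does not yet expect.
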

Shenggan commented 1 year ago

Thanks for reporting this issue, we will test it and fix it as soon as possible.

Shenggan commented 1 year ago

We have added support in #128. You can test it with the latest main branch, and reopen this issue if you run into problems.

kashyapchhatbar commented 1 year ago

Many thanks for the update.

I have a query regarding the processing of multimer queries. When I ran a multimer query, the output PDB contained a single chain (a concatenation of the amino acid sequences). When the default AlphaFold pipeline is run with the multimer preset, the output PDB contains two separate chains. Am I making a mistake in setting the FastFold parameters?

kashyapchhatbar commented 1 year ago

For example, the prediction in the left window is from fastfold using the command

python ~/FastFold_params_v2.3/inference.py multimer_query.fasta \
        ~/alphafold-2.3.0_data/pdb_mmcif/mmcif_files/ \
        --use_precomputed_alignments ./alignments \
        --output_dir ./multimer_query_fastfold_v3 \
        --gpus 1 --model_preset multimer \
        --uniref90_database_path ~/alphafold-2.3.0_data/uniref90/uniref90.fasta \
        --mgnify_database_path ~/alphafold-2.3.0_data/mgnify/mgy_clusters_2022_05.fa \
        --pdb70_database_path ~/alphafold-2.3.0_data/pdb70/pdb70 \
        --uniclust30_database_path ~/alphafold-2.3.0_data/uniref30/UniRef30_2021_03 \
        --bfd_database_path ~/alphafold-2.3.0_data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
        --uniprot_database_path ~/alphafold-2.3.0_data/uniprot/uniprot.fasta \
        --pdb_seqres_database_path ~/alphafold-2.3.0_data/pdb_seqres/pdb_seqres.txt  \
        --param_path ~/alphafold-2.3.0_data/params/params_model_1_multimer_v3.npz \
        --model_name model_1_multimer_v3 \
        --jackhmmer_binary_path `which jackhmmer` \
        --hhblits_binary_path `which hhblits` \
        --hhsearch_binary_path `which hhsearch` \
        --kalign_binary_path `which kalign` \
        --chunk_size 8 --inplace

and the prediction in the right window is from alphafold using the command

python alphafold/docker/run_docker.py \
    --fasta_paths=multimer_query.fasta \
    --max_template_date=2022_12_31 \
    --model_preset=multimer \
    --data_dir=storage/alphafold-2.3.0_data

The folding between the two chains is not predicted if the AAs are concatenated together. In this instance, FastFold predicts the structure in monomer mode despite the preset being set to multimer.

Many thanks, Kashyap

[Image: fastfold_vs_alphafold_multimer (FastFold prediction on the left, AlphaFold prediction on the right)]
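
The single-chain versus two-chain difference described above can also be checked without a viewer by counting residues per chain ID in each output file. This is a minimal sketch that relies only on the fixed-column layout of PDB `ATOM` records (chain ID in column 22, residue number and insertion code in columns 23-27); `count_residues_per_chain` is a hypothetical helper, not something provided by FastFold or AlphaFold.

```python
def count_residues_per_chain(pdb_path):
    """Count distinct residues per chain ID in a PDB file."""
    chains = {}
    with open(pdb_path) as handle:
        for line in handle:
            if not line.startswith("ATOM"):
                continue
            chain_id = line[21]        # column 22: chain identifier
            residue_key = line[22:27]  # columns 23-27: residue number + insertion code
            chains.setdefault(chain_id, set()).add(residue_key)
    # Dict order follows first appearance of each chain in the file.
    return {chain: len(residues) for chain, residues in chains.items()}
```

For a two-chain query, a multimer prediction would be expected to return two chain IDs, each with the length of its input sequence. A single chain ID whose residue count equals the sum of both sequences suggests the inputs were concatenated, as reported here.
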

Shenggan commented 1 year ago

I fixed some issues in #130. You can test it with the latest main branch.

kashyapchhatbar commented 1 year ago

Thanks, I will give it a go.