predict same structure with different random seed

google-deepmind / alphafold3

AlphaFold 3 inference pipeline.


predict same structure with different random seed #86

Closed InvincibleZZH closed 2 days ago

InvincibleZZH commented 2 days ago

After sequence alignment, I want to use different random seeds to get different structures. Is there a way to store the MSA results so that I can skip the alignment step and predict structures directly?

smg3d commented 2 days ago

You can use the xxxx_data.json file, which is saved after the MSA calculations, as input for further inference jobs.

Taking the 2pv7 example, this file is: output/2pv7/2pv7_data.json
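To rerun with a different seed, the saved file can be copied with only its seed list changed. The snippet below is a minimal sketch, assuming the file follows the AlphaFold 3 input JSON format, where seeds are listed under the top-level "modelSeeds" key. The helper name with_new_seeds and the output file name are hypothetical.

```python
import json
from pathlib import Path


def with_new_seeds(data_json, out_json, seeds):
    """Copy an AlphaFold 3 *_data.json file, replacing only its seed list."""
    job = json.loads(Path(data_json).read_text())
    # Assumption: seeds live under the top-level "modelSeeds" key.
    # The MSAs and templates stored under "sequences" are left untouched,
    # so the data pipeline has nothing left to compute for those chains.
    job["modelSeeds"] = [int(s) for s in seeds]
    Path(out_json).write_text(json.dumps(job, indent=2))
    return job
```

For example, with_new_seeds("output/2pv7/2pv7_data.json", "2pv7_seed2.json", [2]) would write a copy that requests seed 2, and that copy can then be passed to the next run as its input JSON.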

smg3d commented 2 days ago

To confirm that it worked as expected, I ran this test:

1. Did a full run (MSA + inference) with seed=1.
2. Did a second run with the saved xxxx_data.json from run 1 as input, again with seed=1. The MSA in the input was recognized automatically: I1120 00:53:07.329561 139822671052800 pipeline.py:404] Skipping MSA and template search for protein chain A because it already has MSAs and templates.

Comparing the two runs, the ranking scores are identical and the only difference in the model file is the timestamp:

diff run-01/output/lrat/ranking_scores.csv run-02/output/lrat/ranking_scores.csv
diff run-01/output/lrat/lrat_model.cif run-02/output/lrat/lrat_model.cif
330c330
< _ma_model_list.model_group_name "AlphaFold-beta-20231127 (3.0.0 @ 2024-11-20 00:34:31)"
---
> _ma_model_list.model_group_name "AlphaFold-beta-20231127 (3.0.0 @ 2024-11-20 00:53:49)"
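The same check can be scripted so that the timestamp line no longer shows up as a difference. This is a minimal sketch using Python's standard difflib; the function name diff_ignoring_timestamp is hypothetical, and it assumes the timestamp appears only on the _ma_model_list.model_group_name line, as in the diff above.

```python
import difflib

# Line prefix that carries the run timestamp in the mmCIF output
# (taken from the diff output above).
TIMESTAMP_PREFIX = "_ma_model_list.model_group_name"


def diff_ignoring_timestamp(text_a, text_b):
    """Return unified-diff lines between two mmCIF texts, skipping the timestamp line."""
    def keep(text):
        # Drop the one line known to differ between otherwise identical runs.
        return [ln for ln in text.splitlines() if not ln.startswith(TIMESTAMP_PREFIX)]

    return list(difflib.unified_diff(keep(text_a), keep(text_b), lineterm=""))
```

Reading both lrat_model.cif files as text and passing them to this function should return an empty list when the two predictions are otherwise identical.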

Augustin-Zidek commented 2 days ago

Fully agreed with the answer by @smg3d, thanks for answering!

jkosinski commented 2 days ago

If needed, you can use my fork at https://github.com/jkosinski/alphafold3, which introduces the --num_seeds flag to run_alphafold.py. This flag allows adding seeds to the JSON input dynamically.

Step 1

Generate the MSA using:

--norun_inference

with a single seed.

Step 2

Run inference with:

--json_path=<json from step 1> \
--norun_data_pipeline \
--num_seeds=1000
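The two steps can be assembled into full command lines, for example from a small Python wrapper. This is a rough sketch under several assumptions: the function name two_step_commands is hypothetical, the location of the step 1 output follows the output/2pv7/2pv7_data.json pattern mentioned earlier in the thread, --output_dir is assumed to be how the output directory is passed, and --num_seeds is the flag described for the fork, not necessarily available elsewhere. Flags for the model and database locations are omitted.

```python
def two_step_commands(input_json, output_dir, job_name, num_seeds):
    """Build argv lists for the MSA-only run and the inference-only run."""
    # Assumption: step 1 writes <output_dir>/<job_name>/<job_name>_data.json.
    data_json = f"{output_dir}/{job_name}/{job_name}_data.json"
    common = ["python", "run_alphafold.py", f"--output_dir={output_dir}"]

    # Step 1: data pipeline only (MSA and template search), no inference.
    step1 = common + [f"--json_path={input_json}", "--norun_inference"]

    # Step 2: inference only, reusing the JSON written by step 1.
    step2 = common + [
        f"--json_path={data_json}",
        "--norun_data_pipeline",
        f"--num_seeds={num_seeds}",  # flag from the fork described above
    ]
    return step1, step2
```

Each list could then be handed to subprocess.run(cmd, check=True), so that step 2 starts only if step 1 finished without error.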