Closed InvincibleZZH closed 2 days ago
You can use the xxxx_data.json file, which is saved after the MSA calculations, as input for further inference jobs.
Taking the 2pv7 example, this file is:
output/2pv7/2pv7_data.json
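To confirm that it worked as expected, I ran this test: the second run skipped the MSA and template search (see the log line below), and its outputs differ from the first run only in the timestamp.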
I1120 00:53:07.329561 139822671052800 pipeline.py:404] Skipping MSA and template search for protein chain A because it already has MSAs and templates.
diff run-01/output/lrat/ranking_scores.csv run-02/output/lrat/ranking_scores.csv
diff run-01/output/lrat/lrat_model.cif run-02/output/lrat/lrat_model.cif
330c330
< _ma_model_list.model_group_name "AlphaFold-beta-20231127 (3.0.0 @ 2024-11-20 00:34:31)"
---
> _ma_model_list.model_group_name "AlphaFold-beta-20231127 (3.0.0 @ 2024-11-20 00:53:49)"
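The same check can be scripted rather than read off a diff by eye. This is only a minimal sketch, assuming the sole expected difference between two runs is the timestamp on the `_ma_model_list.model_group_name` line, as in the diff above. The helper names `strip_timestamp_lines`, `same_ignoring_timestamp` and `compare_files` are made up for illustration and are not part of AlphaFold 3.

```python
from pathlib import Path

# Prefix of the mmCIF line that carries the run timestamp (taken from the diff above).
TIMESTAMP_PREFIX = "_ma_model_list.model_group_name"


def strip_timestamp_lines(text: str) -> list[str]:
    """Return the lines of `text`, dropping those that start with the timestamp prefix."""
    return [line for line in text.splitlines() if not line.startswith(TIMESTAMP_PREFIX)]


def same_ignoring_timestamp(text_a: str, text_b: str) -> bool:
    """True if two mmCIF texts are identical apart from the timestamp line."""
    return strip_timestamp_lines(text_a) == strip_timestamp_lines(text_b)


def compare_files(path_a: str, path_b: str) -> bool:
    """Hypothetical convenience wrapper that reads both files from disk."""
    return same_ignoring_timestamp(Path(path_a).read_text(), Path(path_b).read_text())
```

For the test above this would be called as `compare_files("run-01/output/lrat/lrat_model.cif", "run-02/output/lrat/lrat_model.cif")`.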
Fully agreed with the answer by @smg3d, thanks for answering!
If needed, you can use my fork at https://github.com/jkosinski/alphafold3, which introduces the --num_seeds flag to run_alphafold.py. This flag allows adding seeds to the JSON input dynamically.
1. Generate the MSA with a single seed, using:
   --norun_inference
2. Run inference with:
   --json_path=<json from step 1> \
   --norun_data_pipeline \
   --num_seeds=1000
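Without the fork, a similar effect may be achievable by editing the seed list in the JSON saved by the MSA-only run before passing it to inference. The sketch below assumes the seeds live in a top-level `modelSeeds` list of integers, as in the AlphaFold 3 input format. The function names `with_seeds` and `rewrite_seeds` and the choice of consecutive seeds starting at 1 are illustrative assumptions, not how the fork necessarily does it.

```python
import copy
import json


def with_seeds(data: dict, num_seeds: int, first_seed: int = 1) -> dict:
    """Return a copy of an AlphaFold 3 input dict with `num_seeds` consecutive seeds.

    Assumes the seeds are stored under the top-level "modelSeeds" key.
    """
    if num_seeds < 1:
        raise ValueError("num_seeds must be at least 1")
    updated = copy.deepcopy(data)  # leave the caller's dict untouched
    updated["modelSeeds"] = list(range(first_seed, first_seed + num_seeds))
    return updated


def rewrite_seeds(src_path: str, dst_path: str, num_seeds: int) -> None:
    """Read a saved *_data.json, replace its seeds, and write the result to dst_path."""
    with open(src_path) as f:
        data = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(with_seeds(data, num_seeds), f)
```

The rewritten file would then be passed via --json_path together with --norun_data_pipeline, so that the stored MSAs and templates are reused for every seed.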
After sequence alignment, I want to use different random seeds to get different structures. Is there a way to store the MSA results so that I can skip the alignment step and predict structures directly?