run model relaxation only?

nmrworker commented 1 year ago

Is it possible to run relaxation on already-calculated models without going through the whole calculation again? That is, if I first run the calculation with the "--models_to_relax=none" option and later want to relax all the calculated structures, can I do it without repeating the whole calculation with the "--models_to_relax=all" option?

smturzo commented 1 year ago

You can write a Python script (that calls the Amber relax used in AF2) to do this. Here's an example below (note that you need to run it in the same Python environment as AF2). You may also need to tweak the code.

Script usage: `python alphafold_amber_relax.py your_af2_predicted_structure.pdb -output_dir ./`

Script:

import argparse
import os

from alphafold.common import protein
from alphafold.relax import relax

def relax_with_amber(model_name, output_dir):
    # AlphaFold2 relaxation settings; max_iterations=0 means no iteration limit.
    RELAX_MAX_ITERATIONS = 0
    RELAX_ENERGY_TOLERANCE = 2.39
    RELAX_STIFFNESS = 10.0
    RELAX_EXCLUDE_RESIDUES = []
    RELAX_MAX_OUTER_ITERATIONS = 20
    # Note: recent AlphaFold releases (e.g. 2.3.x) also require a use_gpu argument here.
    amber_relax = relax.AmberRelaxation(max_iterations=RELAX_MAX_ITERATIONS, tolerance=RELAX_ENERGY_TOLERANCE, stiffness=RELAX_STIFFNESS, exclude_residues=RELAX_EXCLUDE_RESIDUES, max_outer_iterations=RELAX_MAX_OUTER_ITERATIONS)
    print(model_name)
    # Read the unrelaxed structure from the PDB file.
    with open(str(model_name)) as f:
        test_prot = protein.from_pdb_string(f.read())
    # Run the relaxation; the first returned value is the relaxed PDB string.
    pdb_min, _, _ = amber_relax.process(prot=test_prot)
    print(pdb_min)
    # Write the result as amber_r_<input file name> inside output_dir.
    out_path = os.path.join(str(output_dir), 'amber_r_' + os.path.basename(str(model_name)))
    with open(out_path, 'w') as rel_f:
        rel_f.write(str(pdb_min))

parser = argparse.ArgumentParser(description='Run Amber Relax (AlphaFold2 settings) on any structure')
parser.add_argument('model_name', help='PDB File to Relax')
parser.add_argument('-output_dir', type=str, default='./', help='Output path for the relaxed model')
args = parser.parse_args()
relax_with_amber(args.model_name, args.output_dir)

nmrworker commented 1 year ago

Thanks. I'll try it out. If I want to relax the whole set of structures, will this script accept regexp formats to represent the whole ensemble of pdbs?

nmrworker commented 1 year ago

I tried it out and got an error: ModuleNotFoundError: No module named 'alphafold'. I guess it has to do with the "same python AF2 environment as AF2" note in your post. Since I'm not a computer science expert, how can I ensure that? I run AF2 with the command "python docker/run_docker.py --fasta_paths=my.fasta --max_template_date=2020-05-14".

smturzo commented 1 year ago

Hi, no problem at all. A lot of it is going to be trial and error. But you are correct, you have to run it within the AF2 directory. You will also need to git clone the AF2 repository if you haven't yet. The other option that will make life easier is shown below. I commented where I made the changes:

import argparse
import os
import sys

parser = argparse.ArgumentParser(description='Run Amber Relax (AlphaFold2 settings) on any structure')
parser.add_argument('model_name', help='PDB File to Relax')
parser.add_argument('-output_dir', type=str, default='./', help='Output path for the relaxed model')
# An option to tell the script where the AF2 code is.
# Also change the **default** to wherever your cloned AF2 is.
parser.add_argument("--af2_dir", default="**your/path/to/alphafold-2.3.1/**", help="AlphaFold code directory")

args = parser.parse_args()
# This line puts the AF code on the module search path. It has to run before
# the alphafold imports below, otherwise they still raise ModuleNotFoundError.
sys.path.append(args.af2_dir)

from alphafold.common import protein
from alphafold.relax import relax

def relax_with_amber(model_name, output_dir):
    # AlphaFold2 relaxation settings; max_iterations=0 means no iteration limit.
    RELAX_MAX_ITERATIONS = 0
    RELAX_ENERGY_TOLERANCE = 2.39
    RELAX_STIFFNESS = 10.0
    RELAX_EXCLUDE_RESIDUES = []
    RELAX_MAX_OUTER_ITERATIONS = 20
    # Note: recent AlphaFold releases (e.g. 2.3.x) also require a use_gpu argument here.
    amber_relax = relax.AmberRelaxation(max_iterations=RELAX_MAX_ITERATIONS, tolerance=RELAX_ENERGY_TOLERANCE, stiffness=RELAX_STIFFNESS, exclude_residues=RELAX_EXCLUDE_RESIDUES, max_outer_iterations=RELAX_MAX_OUTER_ITERATIONS)
    print(model_name)
    # Read the unrelaxed structure from the PDB file.
    with open(str(model_name)) as f:
        test_prot = protein.from_pdb_string(f.read())
    # Run the relaxation; the first returned value is the relaxed PDB string.
    pdb_min, _, _ = amber_relax.process(prot=test_prot)
    print(pdb_min)
    # Write the result as amber_r_<input file name> inside output_dir.
    out_path = os.path.join(str(output_dir), 'amber_r_' + os.path.basename(str(model_name)))
    with open(out_path, 'w') as rel_f:
        rel_f.write(str(pdb_min))

relax_with_amber(args.model_name, args.output_dir)
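
If the ModuleNotFoundError persists, it may help to check that the folder passed as --af2_dir really contains the `alphafold` package before importing from it. Below is a minimal sketch, assuming the cloned repository has a top-level `alphafold` folder; the helper name `add_alphafold_to_path` is made up for illustration and is not part of AlphaFold. It only fixes the module search path; the AF2 dependencies still have to be installed in the Python environment you run it from.

```python
import os
import sys


def add_alphafold_to_path(af2_dir: str) -> None:
    """Add af2_dir to sys.path after checking it holds an 'alphafold' folder.

    Hypothetical helper: assumes the cloned repository keeps its Python
    package in a top-level folder named 'alphafold'.
    """
    package_dir = os.path.join(af2_dir, "alphafold")
    if not os.path.isdir(package_dir):
        raise FileNotFoundError(
            f"No 'alphafold' folder found in {af2_dir!r}; check the --af2_dir value"
        )
    # Put the directory first so this copy of the code is the one imported.
    if af2_dir not in sys.path:
        sys.path.insert(0, af2_dir)
```

In the script this would be called as `add_alphafold_to_path(args.af2_dir)` right after `parser.parse_args()` and before any `from alphafold ...` import.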

nmrworker commented 1 year ago

Thanks. I found out that I need to add an additional "use_gpu" parameter to the AmberRelaxation call. Otherwise it runs well. What about relaxing an ensemble of PDBs, such as "python alphafold_amber_relax.py unrelaxed*.pdb"?
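
One way to handle the extra "use_gpu" parameter mentioned above is to build the keyword arguments in one place and add `use_gpu` only when the installed class accepts it. This is just a sketch: the helper name `relax_kwargs` is made up for illustration, and the numeric values are copied from the script in this thread.

```python
import inspect
from typing import Any, Dict


def relax_kwargs(relax_cls: type, use_gpu: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for relax_cls, adding use_gpu only if accepted.

    Hypothetical helper: relax_cls is meant to be relax.AmberRelaxation.
    """
    kwargs: Dict[str, Any] = {
        "max_iterations": 0,  # 0 means no iteration limit
        "tolerance": 2.39,
        "stiffness": 10.0,
        "exclude_residues": [],
        "max_outer_iterations": 20,
    }
    # Inspect the constructor so the same script works whether or not
    # this version of the class takes a use_gpu argument.
    if "use_gpu" in inspect.signature(relax_cls.__init__).parameters:
        kwargs["use_gpu"] = use_gpu
    return kwargs
```

With this in place the call in the script would become `amber_relax = relax.AmberRelaxation(**relax_kwargs(relax.AmberRelaxation, use_gpu=False))`; set `use_gpu=True` only if a GPU is available to the relaxation step.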

smturzo commented 1 year ago

In that case you will have to change the argument like this:

parser.add_argument('model_name', help = 'A text file containing a list of PDB Files to Relax')

You can make this file on the terminal like this: `ls unrelaxed*.pdb > model_list`

... (other code) ...
models = args.model_name
... (other code) ...

for model in models:
    relax_with_amber(model,args.output_dir)

Hopefully this will relax the pdbs sequentially.

For example if you have a file that looks like this:

unrelaxed_state_1.pdb
unrelaxed_state_2.pdb
unrelaxed_state_3.pdb

Then the script will relax them and write the output like this:

amber_r_unrelaxed_state_1.pdb # this will be relaxed first
amber_r_unrelaxed_state_2.pdb # then this
amber_r_unrelaxed_state_3.pdb # then this
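
As an alternative to the list file, the shell can expand the wildcard itself if the positional argument accepts several values. Here is a minimal sketch, assuming a typical Unix shell that expands `unrelaxed*.pdb` before Python starts; the names `build_parser`, `model_names` and `relaxed_path` are made up for illustration.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run Amber Relax (AlphaFold2 settings) on one or more structures"
    )
    # nargs="+" collects every positional argument into a list, so a shell
    # wildcard such as unrelaxed*.pdb arrives as a list of file names.
    parser.add_argument("model_names", nargs="+", help="PDB files to relax")
    parser.add_argument("-output_dir", type=str, default="./",
                        help="Output path for the relaxed models")
    return parser


def relaxed_path(output_dir: str, model_name: str) -> str:
    """Build the output file name with the amber_r_ prefix used in the script."""
    return os.path.join(output_dir, "amber_r_" + os.path.basename(model_name))
```

The script would then parse with `args = build_parser().parse_args()` and loop with `for model in args.model_names: relax_with_amber(model, args.output_dir)`, so that `python alphafold_amber_relax.py unrelaxed*.pdb -output_dir ./` relaxes the files one after another.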

nmrworker commented 1 year ago

Thanks. It works once I added code to read each line in the file. Wonderful script!
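
For anyone landing here later, the "read each line in the file" step is needed because `args.model_name` is only the name of the list file, so looping over it directly would iterate over its characters. A minimal sketch of that step follows; the helper name `read_model_list` is made up for illustration.

```python
from pathlib import Path
from typing import List


def read_model_list(list_path: str) -> List[str]:
    """Return the PDB paths listed one per line in list_path, skipping blank lines."""
    lines = Path(list_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]
```

The loop in the script then becomes `for model in read_model_list(args.model_name): relax_with_amber(model, args.output_dir)`.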

google-deepmind / alphafold