isambard-uob / isambard

Intelligent System for Analysis, Model Building And Rational Design of biomolecules.
https://isambard-uob.github.io/isambard/
MIT License
20 stars 7 forks source link

Is there a way to generate all register combinations for a given dimer? #16

Closed jmrussell closed 4 years ago

jmrussell commented 4 years ago

The researcher I am working with has asked me to come up with all possible register positions for candidate PDBs for a structure we are interested in. I was able to create a pdb of a dimer I am attempting to look at with the following code:

import sys
import isambard.specifications as specifications
import isambard.modelling as modelling
import isambard.optimisation
import budeff
import isambard.optimisation.evo_optimizers as ev_opts
from isambard.optimisation.evo_optimizers import Parameter

specification = specifications.CoiledCoil.from_parameters
sequences = [
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA',
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA'
]
parameters = [
    Parameter.static('Oligomeric State', 2),
    Parameter.static('Helix Length', 28),
    Parameter.dynamic('Radius', 5.0, 1.0),
    Parameter.dynamic('Pitch', 200, 60),
    Parameter.dynamic('PhiCA', 283, 27),  # 283 is equivalent a g position
]

def get_buff_total_energy(ampal_object):
    return budeff.get_internal_energy(ampal_object).total_energy

opt_ga = ev_opts.GA(specification, sequences, parameters, get_buff_total_energy)
opt_ga.run_opt(100, 5, cores=8)
optimized_model = opt_ga.best_model

I noticed in the tutorial you give the option on changing the register here:https://gist.github.com/ChrisWellsWood/41a8baf9785e1b39a39c70d651d98129

I am new to this so I might be wrong here, but it seems to me you just change the order of the sequence such that the 'F' amino acid is now listed first in the dimer.

With the repeat I have, is the only option I have to change the initial sequence to AQ(n)?

ChrisWellsWood commented 4 years ago

Hi @jmrussell, phi Ca basically defines the register of the first residue, so if you want to try the same sequence with all registers you can uses these ideal values for phi ca:

ideal_phica_for_register =
   {"a": 25.714285714285715,
    "b": 128.57142857142856,
    "c": 231.42857142857144,
    "d": 334.2857142857143,
    "e": 77.14285714285714,
    "f": 180.0,
    "g": 282.85714285714283,
    }

You can calculate these yourself like so:

delta = 360/7
ideal_phicas = [n * delta - (delta/2) for n in range(1, 8)]
>>> [25.714285714285715, 77.14285714285714, 128.57142857142856, 180.0, 231.42857142857144, 282.85714285714283, 334.2857142857143]
# a, e, b, f, c, g, d

Does that answer your question or have I misunderstood?

jmrussell commented 4 years ago

I think I understand. Where does the ideal_phicas equation come from? Did I miss it in your documentation? I tried to look everywhere before I bothered you.

Just to double check - I would essentially use the dictionary you've provided and run specifications.CoiledCoil.from_parameters(2, 28, 2, 201, REGISTER_ADJUST['mykey']) for A:G?

Thank you for the quick response, I really appreciate that.

ChrisWellsWood commented 4 years ago

That's right, if you're doing optimisation, you might want to modify your code above like this:

for register, ideal_phica in ideal_phica_for_register.items():
    ...
    parameters = [
        Parameter.static('Oligomeric State', 2),
        Parameter.static('Helix Length', 28),
        Parameter.dynamic('Radius', 5.0, 1.0),
        Parameter.dynamic('Pitch', 200, 60),
        Parameter.dynamic('PhiCA', ideal_phica, 27),
    ]
    ...

As for where these values come from, it's based off of the geometry of 7 residues mapped onto a helix. Here's a diagram (the residues are processing into the page):

IMG_20200530_102646

We don't currently include the options to define by register, rather than raw phica, because register becomes very confusing with certain types of coiled coil. We could add a build mode for it though, if you thought it would be useful.

jmrussell commented 4 years ago

Thank you, I was able to see what I wanted to model very well based on your instruction.

I am also wondering about applying the registry shift to each molecule individually rather than both helices of my structure? e.g. molecule 1 would have a fixed registry (say "A"), and molecule 2 would have either registry "A", "B", "C".. and so on for all possible 14 orientations? I tried to find more information about do something along those lines myself but I wasn't able to find any documentation on it. If you have something, if you could just point me to it I won't take up any more of your time!

Thanks again.

ChrisWellsWood commented 4 years ago

Great, glad that was what you wanted. As for independently varying the register per helix, you need a slightly more complex specification. Here's an example with an anti-parallel dimer, it's not exactly what you want to do, but it should show you how to vary parameters per helix: https://gist.github.com/ChrisWellsWood/1457a5ea268d3522628ee8eadee5dd93

jmrussell commented 4 years ago

Hi Chris,

I am pretty new to Python and I've had some trouble getting the correct parameters added to a new class I have made. I thought I would be able to follow the tutorial you sent (since both of my molecules have identical sequences), and add the self.orientations parameter, and then change the phi_c_alphas to reflect a change in the second molecule, thus

self.phi_c_alpahs = [phica, ideal_phica]

From there, I would just add ideal_phica in when I generate my list of parameters and then complete the workflow as before, but it seems like isambard hangs up and crashes with no error. Could you provide a suggestion for what I've done wrong? I hate to bother you with what I feel like is a question about Python programming but I've been stuck on this for a week now :(.

class APSwitchRegistry(specifications.CoiledCoil):
      """ Specification for creating anti-parallel coiled coils with switching registry"""
           oligomeric_state = 2
         def __init__(self, helix_length, radius, pitch, phica, ideal_phica, test):
            super().__init__(self.oligomeric_state, auto_build=False)
            self.aas = [helix_length, helix_length]
            self.major_radii = [radius, radius]
            self.major_pitches = [pitch, pitch]
            self.phi_c_alphas = [phica, ideal_phica]
            self.orientations = [1, -1]
            self.build()

for register, ideal_phica in ideal_phica_for_register.items():
            parameters = [
            Parameter.static('Oligomeric State', 2),
            Parameter.static('Helix Length', 28),
            Parameter.dynamic('Radius', 5.0, 1.0),
            Parameter.dynamic('Pitch', 200, 60),
            Parameter.dynamic('PhiCA', 27, 27),
            Parameter.dynamic('Ideal Phica', ideal_phica, 27)
      ]

          print('running register ', register)
          opt_ga = ev_opts.GA(APSwitchRegistry, sequences, parameters, get_buff_total_energy)
          opt_ga.run_opt(100, 5, cores=20)
ChrisWellsWood commented 4 years ago

Hi @jmrussell, sorry I've just noticed this, I don't think that comments on GitHub issues get through to my email properly! It looks like there are two (related) problems with your code:

  1. Your class has an extra parameter test which you should remove:
    class APSwitchRegistry(specifications.CoiledCoil):
        """ Specification for creating anti-parallel coiled coils with switching registry"""
        oligomeric_state = 2
        def __init__(self, helix_length, radius, pitch, phica, ideal_phica):

    I imagine that's not meant to be there any more!

  2. You defined "Oligomeric State" as a parameter, but it this value is fixed so it's constant for your class (see the class attribute oligomeric_state defined above), so you don't need it in your Parameters list:
    parameters = [
            Parameter.static('Helix Length', 28),
            Parameter.dynamic('Radius', 5.0, 1.0),
            Parameter.dynamic('Pitch', 200, 60),
            Parameter.dynamic('PhiCA', 27, 27),
            Parameter.dynamic('Ideal Phica', ideal_phica, 27)
    ]

    Once I made those changes, your code ran. Just remember that your list of Parameters should match the __init__ signature of your class. One useful thing is that you can try out your parameters on your class like this:

default_parameter_values = [p.default_value for p in parameters] 
# looks like this if you print it: [28, 5.0, 200, 27, 282.85714285714283]
test_model = APSwitchRegistry(*default_parameter_values)
# looks like this if you print it: <Assembly containing 2 Polypeptides>
# The asterisk before a list means you take all the values out of the list
# and feed them as arguments to the function, it's called list unpacking

# You could even make a pdb file which you can check in pymol
with open('test_model.pdb', 'w') as outf:
    outf.write(test_model.pdb)

Does that all make sense?

jmrussell commented 4 years ago

I really hope I'm not the only person learning to program that has something explained to him and then just groans really loudly for 10 minutes. Thank you for the explanation. I did not realize that I had added the oligomeric_state line to the class definition like you did in your example and then didn't remove the parameter from my list. I was getting an error message about mismatched parameter numbers (hence the "test"...) and I couldn't figure out why. That explains it! Thanks!

The code does run without error now, but it does appear to not be doing anything... It starts the processes that I request (20) but they die immediately and use 0 resources before hanging.

ChrisWellsWood commented 4 years ago

No, you're certainly not the only person to experience the cringe/groan response, it still happens to me relatively regularly!

That's strange about the error, it ran on my computer fine yesterday, could you post your whole script?

jmrussell commented 4 years ago
class APSwitchRegistry(specifications.CoiledCoil):
      """ Specification for creating anti-parallel coiled coils with switching registry"""
           oligomeric_state = 2
         def __init__(self, helix_length, radius, pitch, phica, ideal_phica):
            super().__init__(self.oligomeric_state, auto_build=False)
            self.aas = [helix_length, helix_length]
            self.major_radii = [radius, radius]
            self.major_pitches = [pitch, pitch]
            self.phi_c_alphas = [phica, ideal_phica]
            self.orientations = [1, -1]
            self.build()

ideal_phica_for_register =
   {"a": 25.714285714285715,
    "b": 128.57142857142856,
    "c": 231.42857142857144,
    "d": 334.2857142857143,
    "e": 77.14285714285714,
    "f": 180.0,
    "g": 282.85714285714283,
    }

sequences = [
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA',
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA'
]

for register, ideal_phica in ideal_phica_for_register.items():
            parameters = [
        Parameter.static('Helix Length', 28),
        Parameter.dynamic('Radius', 5.0, 1.0),
        Parameter.dynamic('Pitch', 200, 60),
        Parameter.dynamic('PhiCA', 27, 27),
        Parameter.dynamic('Ideal Phica', ideal_phica, 27)
]
          print('running register ', register)
          opt_ga = ev_opts.GA(APSwitchRegistry, sequences, parameters, get_buff_total_energy)
          opt_ga.run_opt(100, 5, cores=20)

I am copy pasting this from a VIM instance on a work cluster so hopefully the formatting looks right.

When I run everything up to opt_ga.run_opt, it looks good. I can see that an assembly was made, all the parameters look right, and the optimization object is made as well. If I run the script as I had been, I see that it starts the scrwl4 processes and everything else, but they just hang at 0 cpu usage and appear to do nothing. No error messages.

If I try to manually set 1 list of parameters, I get the error

"APSwitchRegistry object is not callable". I can paste the entire traceback if you want it.

ChrisWellsWood commented 4 years ago

So this script runs on my computer fine:

import sys 
import isambard.specifications as specifications 
import isambard.modelling as modelling 
import isambard.optimisation 
import budeff 
import isambard.optimisation.evo_optimizers as ev_opts 
from isambard.optimisation.evo_optimizers import Parameter  

def get_buff_total_energy(ampal_object): 
    return budeff.get_internal_energy(ampal_object).total_energy 

class APSwitchRegistry(specifications.CoiledCoil): 
    """ Specification for creating anti-parallel coiled coils with switching registry""" 
    oligomeric_state = 2 
    def __init__(self, helix_length, radius, pitch, phica, ideal_phica): 
       super().__init__(self.oligomeric_state, auto_build=False) 
       self.aas = [helix_length, helix_length] 
       self.major_radii = [radius, radius] 
       self.major_pitches = [pitch, pitch] 
       self.phi_c_alphas = [phica, ideal_phica] 
       self.orientations = [1, -1] 
       self.build() 

ideal_phica_for_register = { 
    "a": 25.714285714285715, 
    "b": 128.57142857142856, 
    "c": 231.42857142857144, 
    "d": 334.2857142857143, 
    "e": 77.14285714285714, 
    "f": 180.0, 
    "g": 282.85714285714283, 
} 

sequences = [ 
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA', 
    'QAQAQAQAQAQAQAQAQAQAQAQAQAQA' 
] 

for register, ideal_phica in ideal_phica_for_register.items(): 
    parameters = [ 
        Parameter.static('Helix Length', 28), 
        Parameter.dynamic('Radius', 5.0, 1.0), 
        Parameter.dynamic('Pitch', 200, 60), 
        Parameter.dynamic('PhiCA', 27, 27), 
        Parameter.dynamic('Ideal Phica', ideal_phica, 27) 
    ] 
    print('running register ', register) 
    opt_ga = ev_opts.GA(APSwitchRegistry, sequences, parameters, get_buff_total_energy) 
    opt_ga.run_opt(100, 5, cores=20) 

With this output:

running register  a
gen evals   avg         std     min         max     
0   65      -695.141    12.871  -723.587    -671.349
1   64      -705.326    8.59436 -726.132    -689.731
2   51      -711.243    6.94785 -736.257    -699.942
3   63      -714.167    6.4179  -736.257    -705.241
4   56      -717.552    6.25896 -736.257    -708.08 
Evaluated 399 models in total in 0:00:17.517106
Best fitness is (-736.2566512420018,)
Best parameters are [28, 4.3131092346538615, 140.8989789905734, 27.27941350349349, 13.350852008510193]
Warning! Parameter 2 is at or near minimum allowed value

running register  b
/home/cwwoo/anaconda3/lib/python3.7/site-packages/deap/creator.py:141: RuntimeWarning: A class named 'FitnessMin' has already been created and it will be overwritten. Consider deleting previous creation of that class or rename it.
  RuntimeWarning)
/home/cwwoo/anaconda3/lib/python3.7/site-packages/deap/creator.py:141: RuntimeWarning: A class named 'Individual' has already been created and it will be overwritten. Consider deleting previous creation of that class or rename it.
  RuntimeWarning)
gen evals   avg         std     min         max     
0   68      -702.973    15.7467 -750.364    -678.315
1   82      -720.621    11.8513 -750.518    -702.626
2   76      -733.907    9.62738 -763.578    -718.019
3   75      -743.164    8.72045 -769.908    -731.179
4   73      -750.413    6.88411 -769.908    -740.452
Evaluated 474 models in total in 0:00:21.989433
Best fitness is (-769.9078691398342,)
Best parameters are [28, 4.0, 181.05238196671698, 10.77286355185782, 125.7114189831332]
Warning! Parameter 1 is at or near minimum allowed value

running register  c
gen evals   avg         std     min         max     
0   64      -705.937    11.5651 -731.766    -683.959
1   85      -717.975    8.37192 -751.417    -706.155
2   71      -722.726    7.5487  -751.417    -712.361
3   64      -727.107    6.43018 -751.417    -718.238
4   84      -732.262    7.14693 -755.419    -723.625
Evaluated 468 models in total in 0:00:22.710103
Best fitness is (-755.4188036275414,)
Best parameters are [28, 4.320227930262736, 170.06528881514862, 50.111572975869024, 258.42857142857144]
Warning! Parameter 4 is at or near maximum allowed value

running register  d
gen evals   avg         std     min         max     
0   58      -709.428    18.2816 -755.862    -676.871
1   83      -730.335    11.174  -755.862    -712.944
2   75      -741.709    5.84108 -755.862    -732.289
3   71      -747.777    4.24272 -757.362    -740.959
4   68      -751.26     2.97835 -757.432    -745.959
Evaluated 455 models in total in 0:00:21.968733
Best fitness is (-757.4319108976302,)
Best parameters are [28, 4.150284535894523, 218.501544052683, 8.893052751637974, 309.8949200467432]
running register  e
gen evals   avg         std     min         max     
0   77      -701.533    11.9423 -730.596    -679.596
1   65      -710.89     8.18961 -730.596    -697.494
2   71      -715.996    6.86289 -734.446    -704.916
3   59      -718.794    6.11859 -734.446    -709.514
4   66      -722.625    5.96011 -742.995    -714.19 
Evaluated 438 models in total in 0:00:21.979910
Best fitness is (-742.9952199368935,)
Best parameters are [28, 4.323087243404174, 167.1721423993692, 15.553766814588364, 104.14285714285714]
Warning! Parameter 4 is at or near maximum allowed value

running register  f
gen evals   avg         std     min         max     
0   65      -691.741    11.4996 -718.364    -672.784
1   64      -699.772    9.54221 -726.309    -685.48 
2   69      -706.173    7.31704 -726.309    -693.58 
3   79      -711.979    5.76314 -726.707    -703.854
4   76      -716.512    3.47181 -726.707    -709.711
Evaluated 453 models in total in 0:00:23.327486
Best fitness is (-726.7071111644777,)
Best parameters are [28, 4.66617988322632, 199.7194844331818, 44.03413612567408, 177.91889960026288]
running register  g
gen evals   avg         std     min         max    
0   73      -695.841    15.1936 -742.291    -676.04
1   66      -709.883    14.1544 -748.176    -688.963
2   74      -718.858    14.3512 -755.373    -699.715
3   64      -729.238    13.0456 -758.822    -709.076
4   61      -741.643    10.3295 -758.912    -723.535
Evaluated 438 models in total in 0:00:22.549240
Best fitness is (-758.9121563028381,)
Best parameters are [28, 4.187826666609426, 232.97226944290816, 49.47395119652263, 268.3565743938648]

I wonder if it's something to do with your Scwrl setup. Can you build any models, without the optimiser?

What does this output?:

import isambard
import isambard.modelling as modelling                                                                
import isambard.specifications as specifications                                                                

cc = specifications.CoiledCoil(2)
dimer = modelling.pack_side_chains_scwrl(
    cc,
    [specifications.coiledcoil.basis_set_parameters[2]['sequence']]*2
)
print(dimer.sequences)
jmrussell commented 4 years ago

Hi Chris, I fixed the issue by reinstalling Scrwl4, everything worked perfectly after that! Thank you so much for your help

ChrisWellsWood commented 4 years ago

Great, I'm glad it's working. I'm closing this issue now, but if you need anything else, don't hesitate to contact us.