choderalab / ensembler

Automated omics-scale protein modeling and simulation setup.
http://ensembler.readthedocs.io/
GNU General Public License v2.0
52 stars 21 forks source link

Rosetta loopmodel gets stuck on some models #13

Closed danielparton closed 9 years ago

danielparton commented 9 years ago

Rosetta loopmodel seems to get stuck on the template AURKA_HUMAN_D0_3UOK_A, on the first loop build attempt. Output from a command-line run is below - the program has been stuck following that rmsd message for almost an hour now. I'll look into whether Rosetta has some kind of option for working around this. Otherwise I'll probably see if there is a way to implement a timeout for the subprocess call to loopmodel. Looks like this is possible in the Python 3 version of subprocess, and there is a backport to Python 2.7 here: https://pypi.python.org/pypi/subprocess32

> loopmodel.macosgccrelease -database $MINIROSETTA_DATABASE -in::file::s AURKA_HUMAN_D0_3UOK_A-pdbfixed.pdb -loops:loop_file AURKA_HUMAN_D0_3UOK_A.loop -loops:remodel perturb_kic -loops:refine refine_kic -ex1 -ex2 -nstruct 1 -loops:max_kic_build_attempts 100 -in:file:fullatom

core.init: Rosetta version exported  from http://www.rosettacommons.org
core.init: command: loopmodel.macosgccrelease -database /Users/partond/installers/rosetta/rosetta_2014.35.57232_bundle/main/database/ -in::file::s AURKA_HUMAN_D0_3UOK_A-pdbfixed.pdb -loops:loop_file AURKA_HUMAN_D0_3UOK_A.loop -loops:remodel perturb_kic -loops:refine refine_kic -ex1 -ex2 -nstruct 1 -loops:max_kic_build_attempts 100 -in:file:fullatom
core.init: 'RNG device' seed mode, using '/dev/urandom', seed=548778927 seed_offset=0 real_seed=548778927
core.init.random: RandomGenerator:init: Normal mode, seed=548778927 RG_type=mt19937
protocols.loop_build.LoopBuild: ==== Loop protocol: =================================================
protocols.loop_build.LoopBuild:  remodel        perturb_kic
protocols.loop_build.LoopBuild:  intermedrelax  no
protocols.loop_build.LoopBuild:  refine         refine_kic
protocols.loop_build.LoopBuild:  relax          no
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/env_log.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/pair_log.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt
basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBPoly1D.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBFadeIntervals.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBEval.csv
basic.io.database: Database file opened: scoring/score_functions/rama/Rama_smooth_dyn.dat_ss_6.4
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: talaris2013
core.scoring.etable: Starting energy table calculation
core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: Finished calculating energy tables.
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_n
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_pp
protocols.jd2.PDBJobInputter: Instantiate PDBJobInputter
protocols.jd2.PDBJobInputter: PDBJobInputter::fill_jobs
protocols.jd2.PDBJobInputter: pushed AURKA_HUMAN_D0_3UOK_A-pdbfixed.pdb nstruct index 1
protocols.evaluation.ChiWellRmsdEvaluatorCreator: Evaluation Creator active ...
protocols.jd2.PDBJobInputter: PDBJobInputter::pose_from_job
protocols.jd2.PDBJobInputter: filling pose from PDB AURKA_HUMAN_D0_3UOK_A-pdbfixed.pdb
core.chemical.ResidueTypeSet: Finished initializing fa_standard residue type set.  Created 737 residue types
core.pack.task: Packer task: initialize from command line()
core.pose.util: Cannot open psipred_ss2 file tt
protocols.loops.loops_main: can not open DSSP file tt
protocols.evaluation.ChiWellRmsdEvaluatorCreator: Evaluation Creator active ...
protocols.loop_build.LoopBuildMover: Annotated sequence of pose: F[PHE:NtermProteinFull]EIGRPLGKGKFGNVYLAREKQSKFILALKVLFKAQLEKAGVEHQLRREVEIQSHLRHPNILRLYGYFHDATRVYLILEYAPLGTVYRELQKLSKFDEQRTATYITELANALSYCHSKRVIHRDIKPENLLLGSAGELKIADFGWSVHAPSSRRTTLCGTLDYLPPEMIEGRMHDEKVDLWSLGVLCYEFLVGKPPFEANTYQETYKRISRVEFTFPDFVTEGARDLISRLLKHNPSQRPMLREVLEHPWI[ILE:CtermProteinFull]
protocols.looprelax: ==== Loop protocol: =================================================
protocols.looprelax:  remodel        perturb_kic
protocols.looprelax:  intermedrelax  no
protocols.looprelax:  refine         refine_kic
protocols.looprelax:  relax          no
protocols.evaluation.ChiWellRmsdEvaluatorCreator: Evaluation Creator active ...
core.chemical.ResidueTypeSet: Finished initializing centroid residue type set.  Created 1010 residue types
protocols.loops.Loop: Autoset cut_ for loop 154 156 as 155.
protocols.looprelax: ====================================================================================
protocols.looprelax: ===
protocols.looprelax: ===   Remodel
protocols.looprelax: ===
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: ALL_LOOPS:LOOP  begin  end  cut  skip_rate  extended
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: LOOP start: 154  stop: 156  cut: 155  size: 3  skip rate: 0  extended?: True
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: SELECTEDLOOPS:LOOP  begin  end  cut  skip_rate  extended
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: LOOP start: 154  stop: 156  cut: 155  size: 3  skip rate: 0  extended?: True
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loops_main: Pose fold tree FOLD_TREE  EDGE 1 153 -1  EDGE 153 155 -1  EDGE 153 157 1  EDGE 157 156 -1  EDGE 157 251 -1
protocols.loops.loops_main:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: Setting extended torsions: LOOP start: 154  stop: 156  cut: 155  size: 3  skip rate: 0  extended?: True
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: Building Loop: LOOP start: 154  stop: 156  cut: 155  size: 3  skip rate: 0  extended?: True
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC:
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: Building Loop attempt: 0
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: perturb_one_loop_with_KIC: 154 3
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: remodel init temp: 2
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: remodel final temp: 1
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: kinematic initial perturb with start_res: 154  middle res: 155  end_res: 156
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: loop rmsd before initial kinematic perturbation:0
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: Attempting loop building: 0 ...
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: initial kinematic perturbation complete
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: loop rmsd after initial kinematic perturbation:1.83156
protocols.moves.MonteCarlo: MonteCarlo:: last_accepted_score,lowest_score: -5.9464 -5.9464
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.84795
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.85105
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.82974
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.86528
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.85976
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.66416
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.72086
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.6695
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.70251
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.65091
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.6348
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.6511
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.63551
protocols.loops.loop_mover.perturb.LoopMover_Perturb_KIC: new centroid perturb rmsd: 1.65253
kyleabeauchamp commented 9 years ago

If we can put some input files on gist, I can take a look on my machine and then try to find an issue tracker for this sort of thing.

danielparton commented 9 years ago

Thanks - I've added the necessary files to this repo: https://github.com/danielparton/loopmodel-debugging

kyleabeauchamp commented 9 years ago

Hmm. It seemed to work on my first attempt (https://gist.github.com/kyleabeauchamp/2b17d9fa7a0baeb204b8). What version of rosetta? I have rosetta_2014.35.57232_bundle

kyleabeauchamp commented 9 years ago

Actually, my run also failed, but it didn't hang--it said that it failed.

kyleabeauchamp commented 9 years ago

Could it be due to a loop that's geometrically impossible? This warning message seems to suggest that the problem is a loop with two residues:

[WARNING] Failed to build loop with kinematic Mover during initial kinematic perturbation after 100 trials: LOOP start: 154  stop: 156  cut: 155  size: 3
kyleabeauchamp commented 9 years ago

I think the problem was that the loop was geometrically impossible given the template constraints--the endpoints were too close together to fit 2 residues between them. If you increase the length of the loop by 1, it works:

LOOP 153 156 - - 1
danielparton commented 9 years ago

Thanks for checking this. My run did eventually complete (unsuccessfully) too - after 18 hours. Looks like yours took about 10 hours? I'm using the same Rosetta version as you.

It's actually just one residue which is being added. The residue span defined in the .loop file includes the two residues either side, and these are allowed to be flexible during the modeling process. Either way, I agree it looks like this is a case where it is geometrically impossible for the loop to be built.

One option would be to have ensembler iteratively increase the residue span (up to a max of 2 or 3 residues) until the loop is built successfully. Or I could set a timeout limit for the call to loopmodel. I'm currently looking at the statistics from the loopmodel runs to see if there might be an obvious choice of timeout limit.

jchodera commented 9 years ago

I like the timeout idea!

Eventually, we will replace this with a more sophisticated template reconstruction method anyway, so let's not spend a huge amount of time finding the absolute best ROSETTA-based reconstruction scheme right now.

danielparton commented 9 years ago

Here is a plot of the timing for each loopmodel run, against the number of missing residues:

nmissing_resis_vs_timing

I think I'll go with a timeout of 3 hours. This should mostly cut off unsuccessful runs.

danielparton commented 9 years ago

I used the subprocess32 package to implement a timeout feature, set to 3 hours by default.

The code falls back to the standard subprocess package (without the timeout feature) if subprocess32 is not available.

subprocess32 is available on PyPI, but not binstar, so I built a conda package and uploaded it to my own binstar channel (https://binstar.org/dannyparton/subprocess32), and then added it as a requirement in the ensembler conda recipe.

jchodera commented 9 years ago

Thanks!

Can you make a PR to include the conda recipe in omnia-md/conda-recipes as well?