sokrypton / ColabFold

Making Protein folding accessible to all!
MIT License

"Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead." error code when ranking predicted structures of dimer #499

Open MSMorrison opened 11 months ago

MSMorrison commented 11 months ago

Expected Behavior

Attempting to predict the structure of a homodimer using colab with stock settings, plus amber relaxation of only the top-ranked final structure.

Current Behavior

After finishing the initial 5 predictions, it attempts to rerank models by 'multimer' metric but then stops and the titled error message appears.

Colabfold Output

2023-09-27 14:49:14,001 reranking models by 'multimer' metric
2023-09-27 14:49:16,964 Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

Your Environment

Ran via hosted run-time (T4)

Executed code

#@title Input protein sequence(s), then hit Runtime -> Run all

from google.colab import files
import os
import re
import hashlib
import random

from sys import version_info
python_version = f"{version_info.major}.{version_info.minor}"

def add_hash(x,y):
  return x+"_"+hashlib.sha1(y.encode()).hexdigest()[:5]

query_sequence = 'MSLMVSAGRGLGAVWSPTHVQVTVLQARGLRAKGPGGTSDAYAVIQVGKEKYATSVSERSLGAPVWREEA TFELPSLLSSGPAAAATLQLTVLHRALLGLDKFLGRAEVDLRDLHRDQGRRKTQWYKLKSKPGKKDKERG EIEVDIQFMRNNMTASMFDLSMKDKSRNPFGKLKDKIKGKNKDSGSDTASAIIPSTTPSVDSDDESVVKD KKKKSKIKTLLSKSNLQKTPLSQSMSVLPTSKPEKVLLRPGDFQSQWDEDDNEDESSSASDVMSHKRTAS TDLKQLNQVNFTLPKKEGLSFLGGLRSKNDVLSRSNVCINGNHVYLEQPEAKGEIKDSSPSSSPSPKGFR KKHLFSSTENLAAGSWKEPAEGGGLSSDRQLSESSTKDSLKSMTLPSYRPAPLVSGDLRENMAPANSEAT KEAKESKKPESRRSSLLSLMTGKKDVAKGSEGENPLTVPGREKEGMLMGVKPGEDASGPAEDLVRRSEKD TAAVVSRQGSSLNLFEDVQITEPEAEPESKSEPRPPISSPRAPQTRAVKPRLHPVKPMNAMATKVANCSL GTATIISENLNNEVMMKKYSPSDPAFAYAQLTHDELIQLVLKQKETISKKEFQVRELEDYIDNLLVRVME ETPNILRIPTQVGKKAGKM:MSLMVSAGRGLGAVWSPTHVQVTVLQARGLRAKGPGGTSDAYAVIQVGKEKYATSVSERSLGAPVWREEA TFELPSLLSSGPAAAATLQLTVLHRALLGLDKFLGRAEVDLRDLHRDQGRRKTQWYKLKSKPGKKDKERG EIEVDIQFMRNNMTASMFDLSMKDKSRNPFGKLKDKIKGKNKDSGSDTASAIIPSTTPSVDSDDESVVKD KKKKSKIKTLLSKSNLQKTPLSQSMSVLPTSKPEKVLLRPGDFQSQWDEDDNEDESSSASDVMSHKRTAS TDLKQLNQVNFTLPKKEGLSFLGGLRSKNDVLSRSNVCINGNHVYLEQPEAKGEIKDSSPSSSPSPKGFR KKHLFSSTENLAAGSWKEPAEGGGLSSDRQLSESSTKDSLKSMTLPSYRPAPLVSGDLRENMAPANSEAT KEAKESKKPESRRSSLLSLMTGKKDVAKGSEGENPLTVPGREKEGMLMGVKPGEDASGPAEDLVRRSEKD TAAVVSRQGSSLNLFEDVQITEPEAEPESKSEPRPPISSPRAPQTRAVKPRLHPVKPMNAMATKVANCSL GTATIISENLNNEVMMKKYSPSDPAFAYAQLTHDELIQLVLKQKETISKKEFQVRELEDYIDNLLVRVME ETPNILRIPTQVGKKAGKM' #@param {type:"string"}

#@markdown - Use : to specify inter-protein chainbreaks for modeling complexes (supports homo- and hetero-oligomers). For example PI...SK:PI...SK for a homodimer

jobname = 'RCP Homodimer Prediction Attempt 4' #@param {type:"string"}

# number of models to use

num_relax = 1 #@param [0, 1, 5] {type:"raw"}

#@markdown - specify how many of the top ranked structures to relax using amber

template_mode = "none" #@param ["none", "pdb100","custom"]

#@markdown - none = no template information is used. pdb100 = detect templates in pdb100 (see notes). custom - upload and search own templates (PDB or mmCIF format, see notes)

use_amber = num_relax > 0

# remove whitespaces

query_sequence = "".join(query_sequence.split())

basejobname = "".join(jobname.split())
basejobname = re.sub(r'\W+', '', basejobname)
jobname = add_hash(basejobname, query_sequence)

# check if directory with jobname exists

def check(folder):
  if os.path.exists(folder):
    return False
  else:
    return True
if not check(jobname):
  n = 0
  while not check(f"{jobname}_{n}"): n += 1
  jobname = f"{jobname}_{n}"

# make directory to save results

os.makedirs(jobname, exist_ok=True)

# save queries

queries_path = os.path.join(jobname, f"{jobname}.csv")
with open(queries_path, "w") as text_file:
  text_file.write(f"id,sequence\n{jobname},{query_sequence}")

if template_mode == "pdb100":
  use_templates = True
  custom_template_path = None
elif template_mode == "custom":
  custom_template_path = os.path.join(jobname,f"template")
  os.makedirs(custom_template_path, exist_ok=True)
  uploaded = files.upload()
  use_templates = True
  for fn in uploaded.keys():
    os.rename(fn,os.path.join(custom_template_path,fn))
else:
  custom_template_path = None
  use_templates = False

print("jobname",jobname)
print("sequence",query_sequence)
print("length",len(query_sequence.replace(":","")))
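For readers unfamiliar with the `:` chain-break convention used in `query_sequence`, here is a minimal sketch of how such a string breaks down into chains. The helper name `split_chains` and the short toy sequence are illustrative assumptions, not part of the ColabFold notebook; only the whitespace stripping and the `replace(":","")` length calculation mirror the cell above.

```python
def split_chains(query_sequence):
    """Strip all whitespace, then split on ':' to get one string per chain.

    Hypothetical helper for illustration; not a ColabFold function.
    """
    cleaned = "".join(query_sequence.split())
    return cleaned.split(":")


# Toy homodimer input (made-up residues, with a stray space as in a pasted sequence).
query = "MKT AYI:MKTAYI"

chains = split_chains(query)
is_homo_oligomer = len(set(chains)) == 1

# Same total as len(query_sequence.replace(":", "")) after whitespace removal.
total_length = sum(len(chain) for chain in chains)

print("chains:", chains)
print("homo-oligomer:", is_homo_oligomer)
print("length:", total_length)
```

Applied to the sequence in this issue, the same split would yield two identical chains, which is what makes the job a homodimer prediction.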

doctorbetaq commented 11 months ago

Same error while running multimer prediction. Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

sokrypton commented 11 months ago

This is just a warning. It should still run.

Or are you not getting results?
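As a general illustration of why a deprecation warning on its own does not stop a run, here is a small self-contained sketch using Python's standard `warnings` module. The function `legacy_import_shim` is a hypothetical stand-in that only reuses the message text from this issue; it does not import `simtk` or `openmm`, and it is an assumption that the real package emits its message in a comparable way.

```python
import warnings


def legacy_import_shim():
    """Hypothetical stand-in for a deprecated import path.

    It emits a warning and then returns normally, so the caller keeps running.
    """
    warnings.warn(
        "importing 'simtk.openmm' is deprecated. Import 'openmm' instead."
    )
    return "module loaded"


# Record warnings instead of printing them, so they can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_import_shim()

messages = [str(w.message) for w in caught]

print(result)    # execution continued past the warning
print(messages)  # the warning text that was emitted
```

If a run stalls right after such a message, the stall is more likely in whatever step starts next than in the warning itself.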


MSMorrison commented 11 months ago

> This is just a warning. It should still run. Or are you not getting results?

It stops running after that point. I've tried it twice and given it 2-3 hours at that point but it doesn't seem to progress.

MSMorrison commented 11 months ago

So it does actually attempt to run, it seems. It claims I keyboard-interrupted at the end, but I'm fairly sure I didn't.

code after that point

KeyboardInterrupt                         Traceback (most recent call last)
 in <cell line: 67>()
     65
     66 download_alphafold_params(model_type, Path("."))
---> 67 results = run(
     68     queries=queries,
     69     result_dir=result_dir,

8 frames
/content/colabfold/batch.py in run(queries, result_dir, num_models, is_complex, num_recycles, recycle_early_stop_tolerance, model_order, num_ensemble, model_type, msa_mode, use_templates, custom_template_path, num_relax, keep_existing_results, rank_by, pair_mode, pairing_strategy, data_dir, host_url, user_agent, random_seed, num_seeds, recompile_padding, zip_results, prediction_callback, save_single_representations, save_pair_representations, save_all, save_recycles, use_dropout, use_gpu_relax, stop_at_score, dpi, max_seq, max_extra_seq, use_cluster_profile, feature_dict_callback, **kwargs)
   1471             first_job = False
   1472
-> 1473         results = predict_structure(
   1474             prefix=jobname,
   1475             result_dir=result_dir,

/content/colabfold/batch.py in predict_structure(prefix, result_dir, feature_dict, is_complex, use_templates, sequences_lengths, pad_len, model_type, model_runner_and_params, num_relax, rank_by, random_seed, num_seeds, stop_at_score, prediction_callback, use_gpu_relax, save_all, save_single_representations, save_pair_representations, save_recycles)
    527         if n < num_relax:
    528             start = time.time()
--> 529             pdb_lines = relax_me(pdb_lines=unrelaxed_pdb_lines[key], use_gpu=use_gpu_relax)
    530             files.get("relaxed","pdb").write_text(pdb_lines)
    531             logger.info(f"Relaxation took {(time.time() - start):.1f}s")

/content/colabfold/batch.py in relax_me(pdb_filename, pdb_lines, pdb_obj, use_gpu)
    313         use_gpu=use_gpu)
    314
--> 315     relaxed_pdb_lines, _, _ = amber_relaxer.process(prot=pdb_obj)
    316     return relaxed_pdb_lines
    317

/content/alphafold/relax/relax.py in process(self, prot)
     60     ) -> Tuple[str, Dict[str, Any], Sequence[float]]:
     61     """Runs Amber relax on a prediction, adds hydrogens, returns PDB string."""
---> 62     out = amber_minimize.run_pipeline(
     63         prot=prot, max_iterations=self._max_iterations,
     64         tolerance=self._tolerance, stiffness=self._stiffness,

/content/alphafold/relax/amber_minimize.py in run_pipeline(prot, stiffness, use_gpu, max_outer_iterations, place_hydrogens_every_iteration, max_iterations, tolerance, restraint_set, max_attempts, checks, exclude_residues)
    474
    475   while violations > 0 and iteration < max_outer_iterations:
--> 476     ret = _run_one_iteration(
    477         pdb_string=pdb_string,
    478         exclude_residues=exclude_residues,

/content/alphafold/relax/amber_minimize.py in _run_one_iteration(pdb_string, max_iterations, tolerance, stiffness, restraint_set, max_attempts, use_gpu, exclude_residues)
    408       logging.info("Minimizing protein, attempt %d of %d.",
    409                    attempts, max_attempts)
--> 410       ret = _openmm_minimize(
    411           pdb_string, max_iterations=max_iterations,
    412           tolerance=tolerance, stiffness=stiffness,

/content/alphafold/relax/amber_minimize.py in _openmm_minimize(pdb_str, max_iterations, tolerance, stiffness, restraint_set, exclude_residues, use_gpu)
    102   ret["einit"] = state.getPotentialEnergy().value_in_unit(ENERGY)
    103   ret["posinit"] = state.getPositions(asNumpy=True).value_in_unit(LENGTH)
--> 104   simulation.minimizeEnergy(maxIterations=max_iterations,
    105                             tolerance=tolerance)
    106   state = simulation.context.getState(getEnergy=True, getPositions=True)

/usr/local/lib/python3.10/site-packages/openmm/app/simulation.py in minimizeEnergy(self, tolerance, maxIterations)
    135             to how many iterations it takes.
    136         """
--> 137         mm.LocalEnergyMinimizer.minimize(self.context, tolerance, maxIterations)
    138
    139     def step(self, steps):

/usr/local/lib/python3.10/site-packages/openmm/openmm.py in minimize(context, tolerance, maxIterations)
  11148             the maximum number of iterations to perform. If this is 0, minimation is continued until the results converge without regard to how many iterations it takes. The default value is 0.
  11149         """
> 11150         return _openmm.LocalEnergyMinimizer_minimize(context, tolerance, maxIterations)
  11151     __swig_destroy__ = _openmm.delete_LocalEnergyMinimizer
  11152

KeyboardInterrupt:
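The docstring quoted in the last frame says that a `maxIterations` of 0 means minimization continues until convergence, however many iterations that takes. The toy sketch below illustrates that convention with a one-dimensional gradient-descent loop. It is an assumption-laden illustration only: `minimize`, `grad`, the step size, and the quadratic objective are all made up here, this is not OpenMM's minimizer, and the traceback does not show which `max_iterations` value was actually passed in this run.

```python
def minimize(grad, x0, step=0.1, tolerance=1e-6, max_iterations=0):
    """Toy 1-D gradient descent (illustrative, not OpenMM's algorithm).

    max_iterations == 0 means "no cap": loop until the gradient magnitude
    drops below `tolerance`, however long that takes.
    """
    x = x0
    iterations = 0
    while True:
        g = grad(x)
        if abs(g) < tolerance:
            break  # converged
        if max_iterations and iterations >= max_iterations:
            break  # cap reached before convergence
        x -= step * g
        iterations += 1
    return x, iterations


def grad(x):
    # Gradient of the toy objective f(x) = (x - 3)^2, whose minimum is at x = 3.
    return 2.0 * (x - 3.0)


x_capped, n_capped = minimize(grad, 0.0, max_iterations=5)
x_full, n_full = minimize(grad, 0.0, max_iterations=0)

print("capped:  ", x_capped, "after", n_capped, "iterations")
print("uncapped:", x_full, "after", n_full, "iterations")
```

With a cap, the loop returns early with a partially minimized value; without one, run time depends entirely on how quickly the problem converges, which for a large system can be long.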