marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0
95 stars 43 forks source link

Enhancement: Use MDTraj DSSP #522

Closed KasperBuskPedersen closed 10 months ago

KasperBuskPedersen commented 1 year ago

Hi

A suggestion would be to use the DSSP algorithm implemented in e.g. MDTraj isntead of a custom DSSP code on the users workstation. This would increase consistency across scientific studies using martinize2 since a correct mdtraj version could be required in pip for martinize2.

#!/usr/bin/env python
import sys
import numpy as np
import mdtraj as md

structure=sys.argv[1]
s = md.load(structure)
dssp = md.compute_dssp(s)[0].tolist()

out = "".join(dssp)
f = open("{}.dssp".format(structure), "w")
f.write(out)
f.close()

The sequence can then be used as input for martinize2 or this could happen behind the scenes.

Cheers Kasper

pckroon commented 1 year ago

Hey,

I like this idea! DSSP has been a thorn for a while now. Since MDTraj is a compiled dependency I don't want it as a hard dependency though. That said, how about we try to import compute_dssp from mdtraj, and if we can we use that. If not we fall back to the current implementation. What do you think?

PR would be much appreciated :)

KasperBuskPedersen commented 1 year ago

I think you can safely do that. I have been using the compute_dssp function for 5 years and it has been static and not changed behavior.

Probably less error prone if you implement it yourself :) I didnt have too much time to look into the martinize2 source. Maybe when i start a post doc in jan 2024.

Cheers Kasper

pckroon commented 1 year ago

I won't have time to get into this soon I'm afraid, especially since it's no longer my job to work on this :(

KasperBuskPedersen commented 1 year ago

By the way, remember the simplified=False dssp = md.compute_dssp(s,simplified=False)[0].tolist()

Then '' (Loops and irregular elements) needs to be renamed to 'C'

KasperBuskPedersen commented 1 year ago

Corrected script:

!/usr/bin/env python

import sys import numpy as np import mdtraj as md

structure=sys.argv[1] s = md.load(structure)

dssp = md.compute_dssp(s,simplified=False) dssp_C = ['C' if l==' ' else l for l in dssp[0].tolist()] out = "".join(dssp_C) f = open("{}.dssp".format(structure), "w") f.write(out) f.close()