marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0
89 stars 40 forks source link

include region #422

Closed xikunliu closed 2 years ago

xikunliu commented 2 years ago

I included a new flag -region which can apply rubber bands only to the atoms within the designated regions. One example can be that I know resid 200-300 and 450-500 in my protein is structured, but the rest residues are not. I can use the following command to apply rubber bands only within resid 200-300 and 450-500 among different chains: martinize2 -f example.pdb -ff martini3001 -dssp -elastic -merge A,B,C,D -scfix -region 200:300,450:500

xikunliu commented 2 years ago

I tested the code on a pdb with multiple chains and found that with region [(200-400)]: A200 - A300: Yes EB A200 - A201: No EB (due to minimum distance in connectivity matrix) A100 - A150: No EB A100 - A250: No EB B200 - B300: No EB ( if the total number of residues in one chain is 400, the resid of B200 will become 600. So if we want to apply rubber bands on A200-A400 as well as B200-B400, we need to use the region flag [(200-400),(600-800)] A200 - B200: No EB A200 - B300: No EB A200 - B201: No EB A200 - B500: No EB

pckroon commented 2 years ago

Hmmn, that's not supposed to happen. The resids of chain B being wrong is probably related to at least one of these: #421 #154 #366. Did you merge the chains for this test?

And could you add some tests for your code, describing the desired behaviour in https://github.com/marrink-lab/vermouth-martinize/blob/master/vermouth/tests/test_apply_rubber_band.py Some of these tests will fail due to the resid renumbering, but that's ok for now. We can mark those as expected failures right before we merge this.

fgrunewald commented 2 years ago

@pckroon looks like a nice addition. However, can you wait with merging until we figured out what happens to the resids, because that will probably cause some stuff in the EN code to be changed. I'm investigating at the moment. Also we need to test this for performance. I only had a quick glance and saw a deep copy as well as some interactions, which for large proteins could run out of hand. The EN code is already quite slow.

pckroon commented 2 years ago

@xikunliu Many thanks! Looks great to me. I'll take it from here :)

@fgrunewald I'm in favour of merging this now (although you can still yell "veto"). This PR does not touch the main EN code, it only adds an additional selection criterion. And the only thing getting deepcopied are cli args, since those will get encapsulated in a function. As for resids, part of the issue here was the use of the -merge flag. Besides, fixing resid behaviour will not affect this code, since the selection done here is simple on purpose. PS. Why is the redistributed kdtree failing?

fgrunewald commented 2 years ago

@pckroon fine by me if you vouch for it. CI must pass fully though

pckroon commented 2 years ago

CI was an unrelated transient failure (hypothesis is too good). So I'll merge this.

fgrunewald commented 2 years ago

@pckroon ok good! Please go ahead with merging then.