marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0
95 stars 43 forks source link

Cysteine #402

Closed Tsjerk closed 2 years ago

Tsjerk commented 2 years ago

This PR fixes the issue where a cysteine bridge is to be added between chains, where the cysteines have the same residue id. It was tested on 3i40.pdb. Maybe a good idea to add that as test case.

fgrunewald commented 2 years ago

I've had a closer look at the PR. At the moment 3 integration tests fail. Reasons I found so far:

@pckroon do you have any idea how this messes with features? The problem here seems that cysteine bridges are not added when the resids and all other attributes are the same except for the chain identifiers. The reason is that match_order makes sure the resids cannot be equal.

Here are some more general thoughts:

pckroon commented 2 years ago

The main issue I see is that the whole "order" attribute is a bit of a hack. When we implemented it we only ever considered the resid to the the relevant attribute, but this is (obviously, in hindsight) not true. This PR is also a patch on top of the already festering implementation (but it works! For now). The original intention for order=* is/was to mean "different residues". This got, in this case, implemented as "different resid". This PR brings it more in line with the original intention. This is also how it is intended currently in martini22p, so no issues there.

~With this PR, for order = "*", now both resid and chain id will be taken into account. However, I can think of cases where that still breaks (2 chains without chain id, even though we know they're different chains). Is that a bad thing? Maybe. Is it worth fixing now? Probably not.~ See also #57.

pckroon commented 2 years ago

We should not use a filtering of interactions in DoLinks, in general add_or_replace_interaction takes care of this even though it's super slow. We could also have a temporary interactions dict like in polyply where keys are simply frozensets. That works well and gives quite some speed-up.

This is a good idea. It makes much more sense for add_or_replace_interaction to know which interactions are symmetric (or not). The only issue/problem is that links can do more than add interactions (remove interactions and nodes, for example), so we can't remove /all/ the symmetry filtering from DoLinks, but a middle ground is possible. For example, if a link only adds interactions don't filter anything, and otherwise decide on whether it's symmetric or not based on more complex criteria. Does require some careful thought. What if Links add multiple interactions (which don't need to overlap)?

Tsjerk commented 2 years ago

I'm still missing a testcase that would fail on the current code, but will pass with this PR. LGTM otherwise!

(('', ''), (6, 6), ('A', 'B'), True) was added as per suggestion in commit 295596a419760e59f5085e98be93e3d72647e025.

fgrunewald commented 2 years ago

@pckroon this needs mostly tests and otherwise we're happy with it?

pckroon commented 2 years ago

As far as I'm concerned tests are a "nice to have" in this case (but shouldn't be too hard. I think it's mostly a few edge cases). I'm blocking this on the docs not building though, and for some reason this PR is based on #401 rather than master. So IMHO this needs to be rebased.

fgrunewald commented 2 years ago

This PR will be replaced by #444.