OpenFreeEnergy / kartograf

This package contains tools for setting up hybrid-topology FE calculations
https://kartograf.readthedocs.io/
MIT License

Mapping multimer protein components #46

Closed (ijpulidos closed 2 days ago)

ijpulidos commented 3 months ago

Describe the bug

When mapping gufe protein components built from multimeric PDBs, I'm observing that only part of the multimer gets mapped; apparently only one of the monomers. I would expect kartograf to map the components correctly, or to complain if it can't.

To Reproduce

from kartograf import KartografAtomMapper
from gufe import ProteinComponent

# Create components from PDB Files
protein_comp = ProteinComponent.from_pdb_file("input.pdb")
mutated_comp = ProteinComponent.from_pdb_file("mutated.pdb")

mapper = KartografAtomMapper(atom_map_hydrogens=True)
mapping = next(mapper.suggest_mappings(protein_comp, mutated_comp))
print(len(mapping.componentA_to_componentB))

It seems to map only chain "B" for some reason.

Expected behavior

I expect the length of the mapping to equal the number of atoms in the protein components minus the mutated ones, which should be just a few atoms.
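The expectation above can be turned into a quick sanity check. This is only a sketch: `unmapped_fraction`, `check_mapping_coverage` and the 5% tolerance are hypothetical names and values, not part of kartograf or gufe, and the mapping is assumed to be a plain dict such as `mapping.componentA_to_componentB`.

```python
def unmapped_fraction(mapping: dict, n_atoms_a: int, n_atoms_b: int) -> float:
    """Fraction of the smaller component's atoms that the mapping leaves out."""
    smaller = min(n_atoms_a, n_atoms_b)
    if smaller == 0:
        raise ValueError("components must contain at least one atom")
    return 1.0 - len(mapping) / smaller


def check_mapping_coverage(mapping: dict, n_atoms_a: int, n_atoms_b: int,
                           max_unmapped: float = 0.05) -> float:
    """Raise if more atoms are unmapped than a point mutation would explain.

    max_unmapped is an arbitrary tolerance chosen for illustration.
    """
    frac = unmapped_fraction(mapping, n_atoms_a, n_atoms_b)
    if frac > max_unmapped:
        raise ValueError(
            f"only {len(mapping)} atoms mapped; {frac:.0%} of the smaller "
            "component is unmapped (was only one chain mapped?)"
        )
    return frac
```

With a dimer where only one monomer is mapped, roughly half of the atoms are missing from the mapping, so a check like this would fail loudly instead of silently returning a partial map.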

Additional context

This would enable handling protein mutations in a more streamlined way. As far as I can tell, the current workaround is to separate each monomer (each chain in the PDBs) into its own component and then map those independently, but that can be cumbersome for users.
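For reference, the per-chain workaround could look roughly like the sketch below. It uses only the standard library and relies on the fixed-column PDB layout, where the chain ID sits in column 22. `split_pdb_by_chain` and `format_atom_line` are hypothetical helpers written for this example; they are not kartograf or gufe functions.

```python
from collections import defaultdict


def format_atom_line(serial: int, name: str, resname: str, chain: str,
                     resseq: int, x: float, y: float, z: float) -> str:
    """Build a minimal fixed-column PDB ATOM record (for the example only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


def split_pdb_by_chain(pdb_text: str) -> dict:
    """Group ATOM/HETATM/TER records by chain ID (column 22, index 21).

    Header, CRYST1 and CONECT records are dropped in this sketch.
    """
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "TER")) and len(line) > 21:
            chains[line[21]].append(line)
    # Terminate every per-chain block so it is a standalone PDB body.
    return {cid: "\n".join(lines + ["END"]) + "\n"
            for cid, lines in chains.items()}


example = "\n".join([
    format_atom_line(1, "N", "ALA", "A", 1, 0.0, 0.0, 0.0),
    format_atom_line(2, "CA", "ALA", "A", 1, 1.5, 0.0, 0.0),
    format_atom_line(3, "N", "GLY", "B", 1, 10.0, 0.0, 0.0),
])
per_chain = split_pdb_by_chain(example)
```

Each entry of `per_chain` could then be written to its own file and loaded with `ProteinComponent.from_pdb_file`, as in the reproduction script, before mapping chain by chain.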

PDB files for testing are in the following zip archive: Archive.zip

IAlibay commented 3 months ago

From today's call: a fix here would be a check on the ProteinComponent that detects chain breaks and tells the user how to fix them.
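One possible shape for such a check, as a sketch only: flag consecutive residues in the same chain whose backbone C to N distance is too long to be a peptide bond. The `Residue` container, `find_chain_breaks` and the 2.0 Å cutoff are all assumptions made for this example; none of them is gufe's or kartograf's actual implementation.

```python
import math
from typing import NamedTuple


class Residue(NamedTuple):
    """Hypothetical minimal residue record: backbone N and C coordinates."""
    chain: str
    resseq: int
    n_xyz: tuple
    c_xyz: tuple


def find_chain_breaks(residues, max_cn_distance: float = 2.0) -> list:
    """Return (chain, resseq_before, resseq_after, distance) for each break.

    A peptide C-N bond is about 1.33 Angstrom, so the assumed 2.0 Angstrom
    cutoff leaves generous slack before a pair is reported as broken.
    """
    breaks = []
    for prev, curr in zip(residues, residues[1:]):
        if prev.chain != curr.chain:
            continue  # a new chain starting is expected, not a break
        distance = math.dist(prev.c_xyz, curr.n_xyz)
        if distance > max_cn_distance:
            breaks.append((prev.chain, prev.resseq, curr.resseq, distance))
    return breaks


residues = [
    Residue("A", 1, (0.0, 0.0, 0.0), (2.4, 0.0, 0.0)),
    Residue("A", 2, (3.7, 0.0, 0.0), (6.1, 0.0, 0.0)),
    Residue("A", 3, (12.0, 0.0, 0.0), (14.4, 0.0, 0.0)),  # gap before this one
    Residue("B", 1, (50.0, 0.0, 0.0), (52.4, 0.0, 0.0)),
]
breaks = find_chain_breaks(residues)
```

Reporting the chain and the residue numbers on both sides of the gap gives the user enough information to decide whether to cap the termini or split the component at that point.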

RiesBen commented 3 months ago

@ijpulidos I marked the code bits in the PR where I think the new features need to be implemented, too :) Let me know what you think :) P.S.: I implemented an initial suggestion for splitting the protein chains into components; can you test that one?

jameseastwood commented 4 weeks ago

Irfan's comments should be addressed, but this PR is not blocking any of Ivan's current work.