openmm / pdbfixer

PDBFixer fixes problems in PDB files
Other
453 stars 114 forks source link

Maintain wanted heterogens #206

Open ghost opened 4 years ago

ghost commented 4 years ago

Hello! Is it possible to select heterogens to keep with the Python API? Or is it only possible to keep all, to keep just waters and to remove all? Thank you!

jchodera commented 4 years ago

Currently, it isn't possible to specify which heterogens to keep, but this sounds like a good feature request!

How would you want to specify them? By residue name, residue/chain, or some other way?

peastman commented 4 years ago

It's actually pretty easy. removeHeterogens() is just a very simple wrapper around a call to Modeller.delete(). Here is the complete implementation of it:

        keep = set(proteinResidues).union(dnaResidues).union(rnaResidues)
        keep.add('N')
        keep.add('UNK')
        if keepWater:
            keep.add('HOH')
        toDelete = []
        for residue in self.topology.residues():
            if residue.name not in keep:
                toDelete.append(residue)
        modeller = app.Modeller(self.topology, self.positions)
        modeller.delete(toDelete)
        self.topology = modeller.topology
        self.positions = modeller.positions

If you want to delete a different set of residues, you can write similar code with your own logic for selecting them.

ghost commented 4 years ago

Thank you!

Then I just did the small change below to remove based on the heterogen PDB code


def removeHeterogens(self, keepWater=True, keepHet=[]):

        keep = set(proteinResidues).union(dnaResidues).union(rnaResidues)
        keep.add('N')
        keep.add('UNK')
        if keepWater:
            keep.add('HOH')
        for Het in keepHet:
            keep.add(Het)
        toDelete = []
        for residue in self.topology.residues():
            if residue.name not in keep:
                toDelete.append(residue)
        modeller = app.Modeller(self.topology, self.positions)
        modeller.delete(toDelete)
        self.topology = modeller.topology
        self.positions = modeller.positions
ghost commented 4 years ago

@jchodera if you do add the feature, for my user case scenarios I think it would be useful to have different ways to specify the heterogens you want to maintain: by residue name, by residue plus chain and by residue number.