openmm / pdbfixer

PDBFixer fixes problems in PDB files

Python API: Keep chain #203

Open mirix opened 4 years ago

mirix commented 4 years ago

Hello,

In the Python API I see that there is a removeChains function. However, what I know is the chain ID of the chain I want to keep. What would be the simplest way to keep the interesting monomer without having to retrieve information about the rest?

Best regards,

Miro

peastman commented 4 years ago

If you want to remove all chains except the one at index i, you can do it like this:

chains = list(range(fixer.topology.getNumChains()))
chains.remove(i)
fixer.removeChains(chains)

mirix commented 4 years ago

Thank you. However, that solution still requires one to map chain indices to chain IDs. It is not a big deal, but the information you typically have is that chain C corresponds to a particular domain of a particular gene product. In most instances, but not all, the index of chain C is going to be 2, because it is the third chain the program finds. While this is easy to verify, in my humble opinion it is an unnecessary workaround.

peastman commented 4 years ago

You could do this:

chains = [i for i, c in enumerate(fixer.topology.chains()) if c.id != 'C']
fixer.removeChains(chains)

But be aware that chain IDs often are not unique. A file could easily have two different chains that are both called "chain C".
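
To make the duplicate-ID caveat concrete, here is a minimal sketch of a helper that refuses to guess when the ID is missing or ambiguous. The name `chain_indices_to_remove` is hypothetical and not part of PDBFixer; it only assumes each chain object has an `id` attribute, as the chains yielded by `fixer.topology.chains()` do. The demo uses stand-in objects so it runs without a structure file.

```python
from types import SimpleNamespace


def chain_indices_to_remove(chains, keep_id):
    """Return the indices of all chains whose ID differs from keep_id.

    Hypothetical helper, not part of PDBFixer. Raises ValueError when
    keep_id is absent or shared by more than one chain.
    """
    # Materialize the IDs first, since chains may be a one-shot iterator.
    ids = [chain.id for chain in chains]
    matches = ids.count(keep_id)
    if matches == 0:
        raise ValueError(f"no chain with ID {keep_id!r}")
    if matches > 1:
        raise ValueError(f"{matches} chains share ID {keep_id!r}")
    return [i for i, chain_id in enumerate(ids) if chain_id != keep_id]


# Stand-in chain objects (assumption: only the .id attribute matters here).
demo_chains = [SimpleNamespace(id=chain_id) for chain_id in ("A", "B", "C")]
to_remove = chain_indices_to_remove(demo_chains, "C")
```

With a real fixer, the result would be passed on as `fixer.removeChains(chain_indices_to_remove(fixer.topology.chains(), 'C'))`. Raising on duplicates is one possible policy; keeping every chain with a matching ID would be an equally reasonable choice depending on the file.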

mirix commented 4 years ago

Thanks! Yes, I am aware. Back in the day I wrote a shell script to deal with exceptions in PDB files. It is a thousand lines long and still does not catch all of them. It is getting there, but it is too inefficient for high-throughput use.