Processed Receptor has Long Legs in PDB File

syedzayyan commented 11 months ago

Hello!

I am trying PDBFixer to clean up a PDB file (6D9H) and when I remove the complexed proteins from the receptor, the resulting receptor file is left with long bits (that I am calling legs) sticking out. I tried OpenMM Setup as well. This is the file from openmm-setup.

7ld4-processed.pdb.zip

Any idea how to fix this issue?

This is the code, which is very similar to openmm-setup that I am trying:

chains_to_delete = []
hasHeterogen = False
for chain in fixer.topology.chains():
    if chain.id != 'R':
        chains_to_delete.append(chain.index)
fixer.removeChains(chainIndices=chains_to_delete)

fixer.addMissingAtoms()
fixer.addMissingHydrogens(7.0)
fixer.removeHeterogens()

peastman commented 11 months ago

Many proteins have flexible tails at the ends. Because they're flexible, they generally can't be resolved by crystallography, so they're missing from crystal structures. PDBFixer adds them sticking outward in a straight line, just because that's an easy way to do it, but don't think it means anything. The whole point is that they're flexible. They don't have a fixed conformation.

In some proteins the tails are functionally important. In others they can be safely removed, and it won't affect the function of the protein.

syedzayyan commented 11 months ago

That's what I guessed. For a hassle-free, clean separation of a PDB file into chains, I found a tool from the Bonvin Lab called pdb-tools, in case someone runs into this issue again.

Thank you for the comments and help!

groponp commented 10 months ago

Is there an example for the case where I don't want the N- and C-terminal residues to be added?

When I use the example from your docs, it tells me that missingResidues.keys does not exist.

peastman commented 10 months ago

The manual gives an example of adding residues only in the middle of chains but not at the ends.

fixer.findMissingResidues()
chains = list(fixer.topology.chains())
keys = fixer.missingResidues.keys()
for key in keys:
    chain = chains[key[0]]
    if key[1] == 0 or key[1] == len(list(chain.residues())):
        del fixer.missingResidues[key]
fixer.findNonstandardResidues()

groponp commented 10 months ago

This is my code. The PDB used is 5x00.

    def missing(self, ifile, keep_chain="all"): 
        #self.download_pdb(PDBcode)
        PDBname = ifile
        IO().message("Adding missing atoms to the file {}".format(PDBname), tm="INFO")
        fixer = PDBFixer(filename=PDBname)
        fixer.findMissingResidues()
        chains = list(fixer.topology.chains())
        keys = fixer.missingResidues.keys()
        for key in keys:
            chain = chains[key[0]]
            if key[1] == 0 or key[1] == len(list(chain.residues())):
                del fixer.missingResidues[key]
        fixer.findNonstandardResidues()
        fixer.replaceNonstandardResidues()
        fixer.removeHeterogens(True)
        fixer.findMissingAtoms()
        fixer.addMissingAtoms()

and the error is:

  File "/Volumes/Galvani/Scripts/easyHTMD/test/../easyHTBMD.py", line 73, in <module>
    fixername = PDB().missing(ifile=filename, keep_chain="all")
  File "/Volumes/Galvani/Scripts/easyHTMD/src/PDB.py", line 84, in missing
    for key in keys:
RuntimeError: dictionary changed size during iteration

peastman commented 10 months ago

Try changing it to

keys = list(fixer.missingResidues.keys())
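
For anyone wondering why this helps: in Python 3, dict.keys() returns a live view of the dictionary, so deleting entries while looping over it raises the RuntimeError shown above, whereas list(...) takes a snapshot of the keys first. Here is a minimal sketch using a plain dict as a stand-in for fixer.missingResidues. The sample entries, chain_length, and the is_terminal helper are made up for illustration and are not part of PDBFixer.

```python
# Stand-in for fixer.missingResidues: keys are (chain index, residue index),
# values are the residue names that would be inserted at that position.
# These entries are invented for illustration, not taken from a real PDB file.
missing_residues = {
    (0, 0): ["MET", "SER"],   # gap before the first resolved residue
    (0, 12): ["GLY", "ALA"],  # gap in the middle of the chain
    (0, 40): ["LYS"],         # gap after the last resolved residue
}
chain_length = 40  # hypothetical number of resolved residues in chain 0


def is_terminal(key):
    """Return True if the gap sits at the start or the end of the chain."""
    return key[1] == 0 or key[1] == chain_length


# Deleting while iterating over the live keys() view fails in Python 3.
broken = dict(missing_residues)
try:
    for key in broken.keys():
        if is_terminal(key):
            del broken[key]
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

# Iterating over a snapshot of the keys is safe.
fixed = dict(missing_residues)
for key in list(fixed.keys()):
    if is_terminal(key):
        del fixed[key]

print(error_message)
print(fixed)
```

Only the mid-chain gap survives in fixed, which is the behaviour the snippet in this thread is after.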

groponp commented 10 months ago

Thanks, it helps so much. One more question: how do I avoid adding the N and C termini? Is using only this suitable?

peastman commented 10 months ago

Sorry, I don't understand your question?

groponp commented 10 months ago

I would like to know, on the one hand, whether this code reliably prevents the N- and C-terminal regions from being added. For example, if a protein has 100 amino acids and only the region from 50 to 90 was resolved by X-ray, does this code prevent residues 1 to 49 and 91 to 100 from being added? On the other hand, I would like to know which nomenclature the output PDB uses: Amber ff or CHARMM ff?

peastman commented 10 months ago

Correct. The terminal regions will not be added.

It uses standard PDB nomenclature, as specified in the PDB Chemical Components Dictionary. The output should fully comply with the PDB spec.
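
To make the first answer concrete, here is a rough sketch of the 100-residue case, where residues 50 to 90 are resolved and the missing tails are 1 to 49 and 91 to 100. It uses a plain dict in place of fixer.missingResidues and assumes the keys are (chain index, insertion index in the resolved chain), which is how the snippet earlier in this thread treats them. The "ALA" entries and the chain_lengths dict are placeholders, not real PDBFixer output.

```python
# Hypothetical protein: the sequence has residues 1-100, only 50-90 are resolved.
first_resolved, last_resolved, sequence_length = 50, 90, 100
resolved_count = last_resolved - first_resolved + 1  # 41 residues in the chain

# Assumed contents of the missing-residue dict for chain 0. The second element
# of each key is the index in the resolved chain where residues would be
# inserted. "ALA" is a placeholder residue name.
missing_residues = {
    (0, 0): ["ALA"] * (first_resolved - 1),                            # 1-49
    (0, resolved_count): ["ALA"] * (sequence_length - last_resolved),  # 91-100
}
chain_lengths = {0: resolved_count}

# Keep only the gaps that lie strictly inside a chain.
internal_only = {
    key: names
    for key, names in missing_residues.items()
    if 0 < key[1] < chain_lengths[key[0]]
}

residues_to_add = sum(len(names) for names in internal_only.values())
print(residues_to_add)
```

Both tails fall on index 0 or on the chain length, so nothing is left to add. Building a new dict with a comprehension, as done here, also sidesteps the problem of deleting keys while iterating.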

groponp commented 9 months ago

When using fixer.addMissingHydrogens(7.0), how can I specify CHARMM ff nomenclature for the output?

peastman commented 9 months ago

No, it always produces standard PDB files. CHARMM has a format that it calls "PDB", but it isn't really. It's a proprietary format used only by CHARMM. It differs from the real PDB format in several ways.

groponp commented 9 months ago

Is this method correct for assigning protonation states to residues? For example, I am using NAMD to read the fixed PDB, so when it reads the PDB with psfgen, could it differentiate, for instance, between HIP (+1 charged, both δ- and ε-nitrogens protonated), HID (neutral, δ-nitrogen protonated), and HIE (neutral, ε-nitrogen protonated)? So that I can obtain this in the final result? Or do I have to do this manually? Because when using PDB2PQR, it generates incorrect results.

peastman commented 9 months ago

You're asking questions that are unrelated to this issue. Could you open a new issue instead of tacking them on to this one? Thanks!

openmm / pdbfixer

Processed Receptor has Long Legs in PDB File #276