Closed joelbard closed 3 years ago
Thanks for the suggestion @joelbard, could you provide us with a simple example PDB that we can test on?
Also, could you check if running pdb_tidy first fixes this issue, e.g. pdb_tidy your.pdb | pdb_delinsertion?
The below seems to fix the problem
    offset = 0
    prev_resi = None
    seen_ids = set()
    clean_icode = False
    curChain = ''
    records = ('ATOM', 'HETATM', 'ANISOU', 'TER')
    for line in fhandle:
        if line.startswith(records):
            res_uid = line[17:27]  # resname, chain, resid, icode
            id_res = line[21] + line[22:26].strip()  # A99, B12
            chain = line[21]
            if chain != curChain:
                curChain = chain
                offset = 0
            has_icode = line[26].strip()  # ignore ' ' here
didn't format that well...it's all code...
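For readers who want to try the idea in isolation, here is a minimal, self-contained sketch of per-chain offset resetting. It is not the pdb_delinsertion source: the function name remove_insertion_codes and the renumbering rule (each residue carrying an insertion code bumps the offset by one) are assumptions made for illustration, and it assumes standard fixed-column PDB records in which an inserted residue follows its parent residue.

```python
RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def remove_insertion_codes(lines):
    """Hypothetical helper: drop insertion codes and renumber per chain.

    The offset is reset whenever the chain ID changes, so insertions in
    one chain never shift the numbering of the chains that follow.
    """
    offset = 0
    cur_chain = None
    prev_uid = None
    out = []
    for line in lines:
        text = line.rstrip("\r\n")
        eol = line[len(text):]  # preserve the original line ending, if any
        # Skip records without a residue number (e.g. a bare "TER").
        if text.startswith(RECORDS) and text[22:26].strip():
            chain = text[21]
            if chain != cur_chain:
                cur_chain = chain
                offset = 0  # new chain: start counting insertions afresh
                prev_uid = None
            res_uid = text[17:27]  # resname, chain, resid, icode
            if res_uid != prev_uid:
                prev_uid = res_uid
                if text[26:27].strip():  # residue carries an insertion code
                    offset += 1
            new_num = int(text[22:26]) + offset
            # Write the shifted number and blank out the icode column.
            text = text[:22] + f"{new_num:>4d}" + " " + text[27:]
        out.append(text + eol)
    return out
```

Under these assumptions, residues 100, 100A, 101 in chain A become 100, 101, 102, while a following chain C that starts at 1391 keeps its numbering.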
Hi @joelbard, did you try running pdb_tidy your.pdb | pdb_delinsertion to see if it fixes the issue?
I just tried running the file through pdb_tidy. It does fix the problem with pdb_delinsertion incrementing the residue numbers of subsequent chains. It has the side effect of adding TER cards every time there is a gap in the residue numbering, which I don't think is in keeping with the definition of TER in the PDB format specification. My understanding is that TER is meant to be used only at the true carboxy terminus of the chain, not at spots where residues present in the SEQRES are omitted from the model due to missing density. In my case there is also a deletion in the construct used for crystallography, so the residue numbering jumps to stay consistent with the canonical numbering of the molecule. This leads to a covalent connection between residues with discontinuous numbering. I would certainly not want a TER card between two bonded residues.
Glad that sorted it out. I'd rather keep it like this (tidy + delinsertion) than add more functionality to delinsertion.
Thanks for raising the issue with the TER statements; all valid points. We used TERs to separate discontinuous regions because that's what some (old and new) programs use to signal a chain break. I'll look into changing this so that it only affects true chain endings.
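To make the distinction between a numbering gap and a true chain ending concrete, here is a small sketch. It does not describe how pdb_tidy is implemented: the function chain_end_indices and its break_on_gaps flag are hypothetical names, and the input is simplified to a list of (chain ID, residue number) pairs in file order.

```python
def chain_end_indices(residues, break_on_gaps=False):
    """Hypothetical helper: indices of residues after which a TER would go.

    residues: list of (chain_id, resseq) tuples in file order.
    With break_on_gaps=False only a change of chain ID (or the end of the
    list) ends a chain. With break_on_gaps=True a jump in residue
    numbering within one chain is also treated as a chain end.
    """
    ends = []
    for i, (chain, resseq) in enumerate(residues):
        if i == len(residues) - 1:
            ends.append(i)  # last residue always ends its chain
            continue
        next_chain, next_resseq = residues[i + 1]
        if next_chain != chain:
            ends.append(i)  # true chain ending
        elif break_on_gaps and next_resseq - resseq > 1:
            ends.append(i)  # numbering gap treated as a break
    return ends
```

For a chain numbered 1, 2, 10 followed by a second chain, the gap-based rule marks a break after residue 2, while the chain-only rule leaves residues 2 and 10 in one uninterrupted chain, which is the behavior wanted when they are covalently bonded.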
@joelbard we just pushed a change to pdb_tidy that adds an option not to add TER records on chain breaks. You can try it with pdb_tidy -strict 1abc.pdb. Make sure to use the latest version of pdb-tools: pip install --upgrade pdb-tools.
Thanks for raising this issue! I'll close it but feel free to re-open if you think we should make more changes!
The pdb_delinsertion tool works nicely, but it seems to affect chains beyond the one with the insertion code. I have an antibody-antigen complex where I'm trying to remove the insertion codes for the antibody. The antibody chains are A and B. The antigen sequence numbering starts at C 1391. After removing insertion codes, the antigen now starts at C 1396. The antigen should be unaffected by this operation. Thanks for a very handy tool.