Closed alekhyaa2 closed 9 months ago
This is not an issue with gaps in the PDB reference file.
'15P' is a small-molecule ligand in the structure. Updating the positions of the residues in the structure sequence to the UniProt reference positions leads to an overlap between the small-molecule ligand positions (these are not updated, since there is no reference for them) and the updated reference positions. So we end up with multiple residues sharing the same residue position number. In this case, 'K' and '15P' are both given position 201 in the updated adjacency files.
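The collision described above can be illustrated with a minimal sketch. This is not the project's code: the function name `find_position_collisions`, the `(name, position)` tuple layout, and the sample data are all hypothetical, chosen only to show how a renumbered residue and an un-renumbered ligand can end up sharing a position.

```python
from collections import defaultdict


def find_position_collisions(residues):
    """Return {position: [names]} for every position used by more than one entry.

    `residues` is an iterable of (name, position) pairs; this layout is an
    assumption for illustration, not the real adjacency-file format.
    """
    by_position = defaultdict(list)
    for name, position in residues:
        by_position[position].append(name)
    # Keep only positions that are claimed by two or more entries.
    return {pos: names for pos, names in by_position.items() if len(names) > 1}


# Hypothetical data: amino acids already renumbered to reference positions,
# plus a ligand ('15P') that kept its original PDB position.
renumbered = [("G", 200), ("K", 201), ("L", 202), ("15P", 201)]

collisions = find_position_collisions(renumbered)
print(collisions)  # {201: ['K', '15P']}
```

A check like this only detects the overlap; how to resolve it (for example, excluding ligands from renumbering output or giving them a separate index) is a separate decision.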
To handle this issue,
Is your feature request related to a problem? Please describe.
A gap was found in the sequence alignment of 4U1P at position 98 (also shown in the figure below). This gap is not recorded in the PDB metafile.
Describe alternatives you've considered
The PDB structure does not have the gap observed in our sequence alignment. We found that the adjacency file for 4U1P misrepresents 'K' as '15P', does not assign an amino acid letter to it, and appends it to the unmodelled list.
Tasks
Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.