Closed biochunan closed 10 months ago
Issue Overview:
The file pdb1qd0_0H.mar (AbDb version: 20220926) contains chain mapping information that the current parsing logic misinterprets. Specifically, the hapten is being incorrectly processed as if it were a protein chain.
Details: Below is the relevant section from the file that illustrates the chain mapping:
REMARK 950 RESOL crystal, 2.50A/21.00%
REMARK 950 CHAIN-TYPE LABEL ORIGINAL
REMARK 950 CHAIN H H A
SEQRES 1 H 128 GLN VAL GLN LEU GLN GLU SER GLY GLY GLY LEU VAL GLN
SEQRES 2 H 128 ALA GLY GLY SER LEU ARG LEU SER CYS ALA ALA SER GLY
SEQRES 3 H 128 ARG ALA ALA SER GLY HIS GLY HIS TYR GLY MET GLY TRP
SEQRES 4 H 128 PHE ARG GLN VAL PRO GLY LYS GLU ARG GLU PHE VAL ALA
SEQRES 5 H 128 ALA ILE ARG TRP SER GLY LYS GLU THR TRP TYR LYS ASP
SEQRES 6 H 128 SER VAL LYS GLY ARG PHE THR ILE SER ARG ASP ASN ALA
SEQRES 7 H 128 LYS THR THR VAL TYR LEU GLN MET ASN SER LEU LYS GLY
SEQRES 8 H 128 GLU ASP THR ALA VAL TYR TYR CYS ALA ALA ARG PRO VAL
SEQRES 9 H 128 ARG VAL ALA ASP ILE SER LEU PRO VAL GLY PHE ASP TYR
SEQRES 10 H 128 TRP GLY GLN GLY THR GLN VAL THR VAL SER SER
SEQRES 1 A 1 RR6
ATOM 1 N GLN H 1 -18.952 37.800 -10.339 1.00 45.36 N
ATOM 2 CA GLN H 1 -18.028 37.324 -9.266 1.00 42.98 C
ATOM 3 C GLN H 1 -18.639 36.156 -8.495 1.00 41.75 C
ATOM 4 O GLN H 1 -19.862 35.991 -8.475 1.00 41.32 O
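The chain of interest, denoted as H, is followed by the sequence data and the atom coordinates. Note that the SEQRES block also contains a second entry, chain A, with a single non-amino-acid residue (RR6); this appears to be the hapten.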
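To make the mapping concrete, here is a minimal sketch of reading the REMARK 950 CHAIN records shown above. It assumes the fields are whitespace-separated and that the three columns mean chain type, label, and original chain ID, as the header line suggests. The function name `parse_chain_mapping` is hypothetical and is not part of the existing parser.

```python
def parse_chain_mapping(lines):
    """Collect (chain_type, label, original) tuples from REMARK 950 CHAIN records.

    Assumption: fields are whitespace-separated, in the order given by the
    'REMARK 950 CHAIN-TYPE LABEL ORIGINAL' header line.
    """
    mapping = []
    for line in lines:
        fields = line.split()
        # Expected shape: REMARK 950 CHAIN <type> <label> <original>
        # The header line has 'CHAIN-TYPE' in the third field, so it is skipped.
        if len(fields) == 6 and fields[:3] == ["REMARK", "950", "CHAIN"]:
            mapping.append(tuple(fields[3:]))
    return mapping


example = [
    "REMARK 950 RESOL crystal, 2.50A/21.00%",
    "REMARK 950 CHAIN-TYPE LABEL ORIGINAL",
    "REMARK 950 CHAIN H H A",
]
chain_mapping = parse_chain_mapping(example)
print(chain_mapping)
```

For this file the result shows label H mapped to original chain A, while the hapten's SEQRES record also carries the chain ID A, which may be relevant to the misclassification.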
Encountered Problem: The parser treats the hapten in the file as a protein chain, which is not the intended behavior. This could lead to inaccurate molecular structure representations and downstream analyses.
Proposed Solution:
The parsing logic needs to be refined to accurately distinguish between protein chains and non-protein entities such as haptens (🤔 try biopandas).
Implementing a check that differentiates these entities before parsing can prevent such errors.
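One way such a check could look, as a rough sketch in plain Python rather than biopandas: group SEQRES residues by chain and treat a chain as protein only if most of its residues are standard amino acids. The helper names (`seqres_by_chain`, `is_protein_chain`), the `min_fraction` threshold, and the whitespace-based field splitting are all assumptions made for illustration; real PDB records are fixed-width and may need column-based parsing.

```python
# Three-letter codes of the 20 standard amino acids.
STANDARD_AA = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def seqres_by_chain(lines):
    """Group SEQRES residue names by chain ID (whitespace-split; an assumption)."""
    chains = {}
    for line in lines:
        fields = line.split()
        # Expected shape: SEQRES <serial> <chain> <num_residues> <res1> <res2> ...
        if len(fields) >= 5 and fields[0] == "SEQRES":
            chains.setdefault(fields[2], []).extend(fields[4:])
    return chains


def is_protein_chain(residues, min_fraction=0.9):
    """Return True if at least min_fraction of residues are standard amino acids.

    min_fraction is a hypothetical threshold, not a value from the project.
    """
    if not residues:
        return False
    n_standard = sum(res in STANDARD_AA for res in residues)
    return n_standard / len(residues) >= min_fraction


example = [
    "SEQRES 1 H 128 GLN VAL GLN LEU GLN GLU SER GLY GLY GLY LEU VAL GLN",
    "SEQRES 1 A 1 RR6",
]
chains = seqres_by_chain(example)
protein_flags = {chain: is_protein_chain(res) for chain, res in chains.items()}
print(protein_flags)
```

Running the check before the chain mapping step would let the parser skip chain A (RR6) here while keeping chain H.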
Additional Information:
Attached is a screenshot of 1qd0H_0H for visual reference:
Screenshot of 1qd0H_0H
Expected Outcome: Enhance the accuracy of the chain mapping process and ensure that non-protein entities are not misclassified.
Describe the bug
cdrclu failed on 1qd0_0H
To Reproduce Steps to reproduce the behavior:
1qd0_0H
Expected behavior NA
Screenshots NA
Additional context NA