Open 0ut0fcontrol opened 7 months ago
I want to use label_seq_id and the bond orders from the mmCIF file. Replacing res_id with auth_seq_id in get_structure() makes the bonds right when use_author_fields=False is set.
It is ugly 😂.
Better way to handle this?
Hi, thanks for the report. I dug into this problem a bit, and my conclusion is that with use_author_fields=False both NAGs are considered the same residue. The start of a new residue is detected automatically when one of chain_id, res_id, ins_code or res_name changes. As you can see in the CIF, none of these change in this edge case, so both NAGs are treated as one residue. As a consequence, only the first occurrence of each atom is connected to its corresponding bond partner.
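To illustrate the detection rule described above, here is a minimal pure-Python sketch. It is not Biotite's actual get_residue_starts() implementation, and the function name residue_starts and all annotation values are hypothetical, chosen only to mimic two adjacent NAGs.

```python
def residue_starts(chain_id, res_id, ins_code, res_name):
    """Return the indices at which a new residue begins.

    Simplified sketch: a residue starts at the first atom and wherever
    any of the four annotations differs from the previous atom.
    """
    starts = [0] if len(chain_id) > 0 else []
    for i in range(1, len(chain_id)):
        if (
            chain_id[i] != chain_id[i - 1]
            or res_id[i] != res_id[i - 1]
            or ins_code[i] != ins_code[i - 1]
            or res_name[i] != res_name[i - 1]
        ):
            starts.append(i)
    return starts


# Two NAG residues with two atoms each (made-up values).
res_name = ["NAG"] * 4
ins_code = [""] * 4

# Author-style annotations: the residue ID differs between the two NAGs.
auth_starts = residue_starts(["A"] * 4, [401, 401, 402, 402], ins_code, res_name)

# Label-style annotations in the edge case: nothing changes between the NAGs.
label_starts = residue_starts(["B"] * 4, [1, 1, 1, 1], ins_code, res_name)

print(auth_starts)   # two residues detected
print(label_starts)  # both NAGs merged into one residue
```

With the author-style values two residue starts are found, while with the label-style values only one is found, which is the situation that leads to the missing bonds.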
I would consider this a bug in get_residue_starts(). However, it is rather cursed to fix: we could add an additional condition that checks whether the same atom name appears again within a residue. This check would be rather computationally expensive, though, so I assume it would slow down a number of functionalities in Biotite, such as adding bonds to structures.
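The additional condition mentioned above could look roughly like the following sketch. This is only an illustration of the idea, not a proposed patch: the function name residue_starts_with_name_check is hypothetical, and the per-atom set lookup is the kind of bookkeeping that would probably be harder to vectorize than a plain element-wise comparison.

```python
def residue_starts_with_name_check(chain_id, res_id, ins_code, res_name, atom_name):
    """Return residue start indices, also splitting on repeated atom names.

    Sketch only: a new residue starts when one of the four annotations
    changes OR when an atom name was already seen in the current residue.
    """
    starts = []
    seen = set()  # atom names seen so far in the current residue
    for i in range(len(atom_name)):
        annotations_changed = i == 0 or (
            chain_id[i] != chain_id[i - 1]
            or res_id[i] != res_id[i - 1]
            or ins_code[i] != ins_code[i - 1]
            or res_name[i] != res_name[i - 1]
        )
        if annotations_changed or atom_name[i] in seen:
            starts.append(i)
            seen = set()
        seen.add(atom_name[i])
    return starts


# Two NAGs that are indistinguishable by their annotations (made-up values).
starts = residue_starts_with_name_check(
    ["B"] * 4, [1] * 4, [""] * 4, ["NAG"] * 4, ["C1", "C2", "C1", "C2"]
)
print(starts)
```

Note that such a check would also split a residue whose atoms legitimately repeat a name, for example if several alternate locations were kept, so it would need more care in practice.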
For now, a solution for your concrete problem would be to use use_author_fields=True and add the label_* fields of interest to extra_fields. If you want, you can later overwrite the annotation arrays with the label_* ones, e.g.
atoms.chain_id = atoms.label_asym_id
Your solution appears to be cleaner (with no biotite source code alterations). Thank you!
How to get bonds right when use_author_fields=False?