NaegleLab / CoDIAC

GNU General Public License v3.0

Unmodeled regions returned as domains in PDB reference csv file #49

Open alekhyaa2 opened 2 months ago

alekhyaa2 commented 2 months ago

Is your feature request related to a problem? Please describe.

Some of the PIK3R1 PDB structures are in complex with PIK3CA, with PIK3CA and PIK3R1 as chains A and B, respectively. PIK3R1 chain B is crystallized from residues 322-600, but 322-441 is an unmodeled region. This sequence range also encodes an SH2 domain. In our PDB reference file, the SH2 domain is listed because it lies within the crystallized range of the structure, but since the region is unmodeled, the structure does not actually contain the domain. As a result, while comparing RMSDs between SH2 domains, I found that the SH2 domain is missing from these PIK3R1 complexes (PDB IDs: 2RD0, 5M6U, 4A55).

Below is an image of PIK3R1 chain B from PDB structure 2RD0, which consists of only two helices.

Screenshot 2024-04-16 at 11 44 18 PM

Below is chain B of 5SXF, which has two helices and a partially modeled SH2 domain.

Screenshot 2024-04-17 at 12 02 06 AM
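
The check described above could be sketched roughly as follows: given a domain range and a list of unmodeled ranges, compute how much of the domain is actually present in the structure. The function name `modeled_fraction` and the SH2 boundaries used in the example are hypothetical and only for illustration; this is not existing CoDIAC code.

```python
def modeled_fraction(domain, unmodeled):
    """Return the fraction of a domain's residues that are modeled.

    domain    -- (start, end) inclusive residue range of the domain
    unmodeled -- list of (start, end) inclusive unmodeled residue ranges
    Both must use the same numbering scheme (an assumption of this sketch).
    """
    start, end = domain
    total = end - start + 1
    missing = set()
    for u_start, u_end in unmodeled:
        # Clip each unmodeled range to the domain; the range is empty
        # when the two do not overlap.
        missing.update(range(max(start, u_start), min(end, u_end) + 1))
    return 1 - len(missing) / total


# Illustrative SH2 boundaries (hypothetical) against the unmodeled
# region 322-441 reported for chain B.
sh2 = (333, 428)
print(modeled_fraction(sh2, [(322, 441)]))  # fully unmodeled domain
print(modeled_fraction(sh2, []))            # fully modeled domain
```

A domain could then be dropped from the reference file, or flagged, when this fraction falls below some threshold. The threshold itself would need to be chosen, since partially modeled domains such as the one in 5SXF fall in between.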

Tasks

Include specific tasks in the order they need to be done. Include links to the specific lines of code where each task should happen.

alekhyaa2 commented 2 months ago
Screenshot 2024-04-17 at 12 12 24 AM

Here is a screenshot of the PDB entries

knaegle commented 1 month ago

Update from the conversation: we would like this information to come directly from the PDB API rather than being extracted from CIF files. The web interface suggests that unmodeled regions can be pulled from the API.
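
As a starting point, here is a minimal sketch of how this might look against the RCSB Data API. It assumes that the `polymer_entity_instance` REST endpoint returns a `rcsb_polymer_instance_feature` list whose entries of type `UNOBSERVED_RESIDUE_XYZ` carry `feature_positions` with `beg_seq_id` and `end_seq_id`; these field names should be verified against the current RCSB schema. The helper names are hypothetical. Note also that the endpoint expects the `label_asym_id`, which may differ from the author chain ID, and that the returned positions are presumably in entity sequence numbering, so they would still need mapping to the reference numbering.

```python
import json
from urllib.request import urlopen

# Assumed endpoint of the RCSB Data API (verify before relying on it).
RCSB_INSTANCE_URL = (
    "https://data.rcsb.org/rest/v1/core/polymer_entity_instance/{entry}/{asym}"
)


def fetch_instance(entry_id, asym_id):
    """Download the polymer entity instance record as a dict."""
    url = RCSB_INSTANCE_URL.format(entry=entry_id, asym=asym_id)
    with urlopen(url) as resp:
        return json.load(resp)


def unobserved_ranges(instance):
    """Extract sorted (start, end) ranges of unmodeled residues.

    The keys used here are assumptions about the response layout.
    """
    ranges = []
    for feature in instance.get("rcsb_polymer_instance_feature") or []:
        if feature.get("type") != "UNOBSERVED_RESIDUE_XYZ":
            continue
        for pos in feature.get("feature_positions") or []:
            beg = pos["beg_seq_id"]
            # A single-residue gap may omit the end position.
            end = pos.get("end_seq_id", beg)
            ranges.append((beg, end))
    return sorted(ranges)
```

Usage would be along the lines of `unobserved_ranges(fetch_instance("2RD0", "B"))`, with the second argument replaced by the correct `label_asym_id` for the PIK3R1 chain.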