NaegleLab / CoDIAC

Integrate structure reference misnames domains in multidomain proteins #54

Closed knaegle closed 4 months ago

knaegle commented 4 months ago

Description

Matched domains are tracked in a dictionary, and as a result the domain names written out when integrating domains from the UniProt reference are incorrect. This approach is also not robust when a structure contains more than two domains of the same type.
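To make the failure mode concrete, here is a minimal sketch of how a name-keyed dictionary can end up renaming repeated domains. This is not the CoDIAC code: `track_with_dict`, the tuple layout, and the coordinates are hypothetical and purely illustrative. Dictionary keys must be unique, so a second domain of the same type has to be stored under a modified key, and that key may then leak into the output as the domain name.

```python
def track_with_dict(matched):
    """Hypothetical tracker that keys matched domains by name."""
    tracked = {}
    for name, interpro_id, start, end in matched:
        key = name
        n = 1
        # A repeated name cannot reuse the key, so a suffix is added.
        while key in tracked:
            n += 1
            key = f"{name}_{n}"
        tracked[key] = (interpro_id, start, end)
    return tracked


# Illustrative input: two domains of the same type (coordinates are made up).
matched = [
    ("SH2", "IPR000980", 10, 100),
    ("SH2", "IPR000980", 120, 210),
]
tracked = track_with_dict(matched)
names = list(tracked.keys())
```

Under these assumptions the second domain is stored under a suffixed key, so the name no longer matches the reference even though the InterPro ID is identical.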

Files

In PLCG1 structure 4EY0, the domains are named SH2 and SH2_2 even though they have the same InterPro ID.

Expected behavior

Domains appended from the structure should have exactly the same names as the corresponding domains in the UniProt reference.
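One possible way to get this behavior is to track matches in an ordered list of records instead of a dictionary, copying the name straight from the reference entry. The sketch below assumes a hypothetical `DomainRecord` type and a hypothetical `append_structure_domains` helper that receives matches as `(reference_index, structure_start, structure_end)` tuples; none of these names come from CoDIAC, and the coordinates are illustrative only.

```python
from typing import List, NamedTuple, Tuple


class DomainRecord(NamedTuple):
    """Hypothetical record for one domain."""
    name: str
    interpro_id: str
    start: int
    end: int


def append_structure_domains(
    reference: List[DomainRecord],
    matches: List[Tuple[int, int, int]],
) -> List[DomainRecord]:
    """Build structure domains, reusing each reference name unchanged."""
    result = []
    for ref_index, s_start, s_end in matches:
        ref = reference[ref_index]
        # The name and InterPro ID are copied as-is; only coordinates differ.
        result.append(DomainRecord(ref.name, ref.interpro_id, s_start, s_end))
    return result


# Illustrative reference with three domains of the same type.
reference = [
    DomainRecord("SH2", "IPR000980", 10, 100),
    DomainRecord("SH2", "IPR000980", 120, 210),
    DomainRecord("SH2", "IPR000980", 230, 320),
]
matches = [(0, 1, 91), (1, 111, 201), (2, 221, 311)]
structure_domains = append_structure_domains(reference, matches)
structure_names = [d.name for d in structure_domains]
```

Because a list allows repeated names, this design needs no suffixes and works for any number of same-type domains in one structure.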
