Is your feature request related to a problem? Please describe.
The SH2 domain FASTA sequences generated from the available PDB structures for genes ABL1 and HCK have inconsistent lengths.
For ABL1: the "CANNONICAL_SEQ_BEG_POSITION" value does not match the start position in the "reference range" column of the PDB_reference metafile, which shifts the start positions of the generated sequences. The attached snapshot shows the difference in start positions for PDBs 1OPL, 6AMV, and 6AMW.
A second source of the sequence difference is the SH2 domain boundary. The affected structures (1OPL, 6AMV, 6AMW) use an SH2 domain boundary of 123-215, whereas the other ABL1 structures use 125-217.
For HCK: although the "reference range" start position matches "CANNONICAL_SEQ_BEG_POSITION", the sequences for PDB structures 2HCK, 1AD5, and 1QCF are still shifted by one amino acid.
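A check along these lines might help list every structure whose two start positions disagree. This is only a rough sketch: the record layout, the `pdb_id` and `reference_range` keys, the `start-end` string format, and the helper name `find_start_mismatches` are all assumptions for illustration, not the actual metafile schema or code in this repo. The sample values are placeholders, not data from the metafile.

```python
def find_start_mismatches(records):
    """Return {pdb_id: offset} for records whose reference-range start
    differs from CANNONICAL_SEQ_BEG_POSITION (offset = range start - canonical start)."""
    mismatches = {}
    for rec in records:
        # Assumed format: "start-end", e.g. "10-120"
        ref_start = int(rec["reference_range"].split("-")[0])
        offset = ref_start - rec["CANNONICAL_SEQ_BEG_POSITION"]
        if offset != 0:
            mismatches[rec["pdb_id"]] = offset
    return mismatches


# Placeholder records for illustration only (hypothetical IDs and positions)
records = [
    {"pdb_id": "AAAA", "reference_range": "10-120", "CANNONICAL_SEQ_BEG_POSITION": 10},
    {"pdb_id": "BBBB", "reference_range": "12-120", "CANNONICAL_SEQ_BEG_POSITION": 10},
]

print(find_start_mismatches(records))
```

Note that this would not catch the HCK case, where the two start positions already agree but the sequences are still shifted.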
Describe alternatives you've considered
I tried manually changing the start position in the "reference range" column for the ABL1 structures, but the contactmap class then cannot assign a refseq because the lengths of structseq and refseq are not equal.
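The arithmetic behind this can be sketched as follows, assuming the range is inclusive and that the refseq length follows from it (I have not confirmed how the contactmap class derives it). The helper names `range_length` and `shift_range` are hypothetical. Both boundaries quoted above span the same number of residues, so moving only the start shortens the range, while shifting both ends keeps the length. Whether shifting both ends is the right correction is still open.

```python
def range_length(start, end):
    """Number of residues in an inclusive start-end range."""
    return end - start + 1


def shift_range(start, end, offset):
    """Shift both ends of a range by the same offset, preserving its length."""
    return start + offset, end + offset


# Boundaries quoted in this issue for ABL1
print(range_length(123, 215))    # 1OPL, 6AMV, 6AMW boundary
print(range_length(125, 217))    # boundary of the other ABL1 structures
print(range_length(125, 215))    # start changed alone: range is shorter
print(shift_range(123, 215, 2))  # both ends shifted: same length as before
```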
Tasks
Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.
[x] Check whether the gaps are mapped correctly in integrateStructure_Reference.py
[ ] Resolve the sequence length issue for HCK structures (cause of the one-residue shift not yet identified)