ANHIG / IMGTHLA

Github for files currently published in the IPD-IMGT/HLA FTP Directory hosted at the European Bioinformatics Institute
http://www.ebi.ac.uk/ipd/imgt/hla/

DRB3*03:57Q-related issue in the DRB_prot.txt file for releases 3.49.0 to 3.51.0 #332

Closed sjmack closed 1 year ago

sjmack commented 1 year ago

The nine-nucleotide insertion in DRB3*03:57Q results in confusing position numbering in the DRB_prot.txt file. The three-residue insertion in this protein causes an indel column in the alignment to be labeled as position 67, when the column immediately to its right is the actual position 67. This makes it difficult to identify the correct position coordinates for that section of the alignment. I notice a similar issue in the HLA-A and HLA-B protein alignments. The ideal solution would be to start the numbering of positions after any initial indel (".") positions.

[Attached images: DRB_position_67_confusion, HLA-A_position_155_confusion, HLA-B_position_141_confusion]

jrob119 commented 1 year ago

Thanks for the feedback on this. The numbering is based only on valid bases in the reference sequence. As such, the '.' is not counted as a valid base, and the numbering would resume once the first valid base is encountered.
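
To make the rule above concrete, here is a minimal sketch of how such numbering could be computed, not the actual code used to generate the alignment files. The function name, the `first_position` parameter, and the example row are all hypothetical; the residues are made up and are not the real DRB3 reference sequence. It assumes a reference row with no spaces and with '.' as the only gap symbol.

```python
def number_reference_columns(reference_row, first_position=1, gap_char="."):
    """Label each alignment column: an int for a valid residue, None for a gap.

    Hypothetical helper for illustration only. Gap columns do not consume
    a position number, so numbering resumes at the next valid residue.
    """
    labels = []
    next_position = first_position
    for symbol in reference_row:
        if symbol == gap_char:
            # Indel column in the reference: no position is assigned.
            labels.append(None)
        else:
            labels.append(next_position)
            next_position += 1
    return labels


# Made-up reference row with a three-column gap (assumption, not real data).
reference = "GRP...DAE"
labels = number_reference_columns(reference, first_position=64)

for symbol, label in zip(reference, labels):
    print(symbol, label)
```

In this sketch the three gap columns carry no number, and position 67 falls on the first residue to the right of the gap, which matches the behavior described in the report.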