deustp01 closed 1 year ago
Clean-up is complete, so this issue can be closed. @nataled
Just an FYI: PRO retained all those allelic variants that were deprecated in UniProtKB.
I have verified that, in all cases given in the first sheet of the indicated spreadsheet, the PRO entry for the deprecated UniProtKB entries is given by PR:
This ticket was opened in error, because it duplicates #226. All new discussion and links from yesterday have been moved to that ticket, so this one can be ignored. Apologies for the confusion.
UniProt originally created large numbers of entries for HLA-A, -B, -C, and -DRB1 antigens, reflecting the enormous number of distinct HLA (histocompatibility) antigens. Human genome sequencing revealed that each of these antigen sets is due to extensive polymorphic variation of a single gene product. As a result, in 2019, UniProt obsoleted all but one entry each for HLA-A, -B, -C, and -DRB1. Reactome, meanwhile, had separate entries for many of these now-obsolete allele instances, almost all of which were used to create defined_set instances that were then used to annotate reactions involving HLA antigens.
The fix is to obsolete all of these allele instances in Reactome, with a _deletedInstance attribute in each case pointing from the allele instance to the surviving canonical UniProt sequence for HLA-A (P04439), HLA-B (P01889), HLA-C (P10321), and HLA-DRB1 (P01911).
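The mapping described above can be sketched roughly as follows. This is only an illustration of the allele-to-canonical lookup, not the curation tooling actually used: the function name `plan_obsoletions`, the `(accession, gene)` input shape, and the placeholder accession `X00001` are all assumptions made for the example. Only the four canonical accessions come from this ticket.

```python
# Canonical UniProt accessions named in this ticket.
CANONICAL = {
    "HLA-A": "P04439",
    "HLA-B": "P01889",
    "HLA-C": "P10321",
    "HLA-DRB1": "P01911",
}


def plan_obsoletions(allele_instances):
    """Build a list of obsolete -> canonical replacement records.

    allele_instances is an iterable of (accession, gene) pairs.
    This input shape is hypothetical, chosen only for illustration.
    """
    plan = []
    for accession, gene in allele_instances:
        canonical = CANONICAL.get(gene)
        if canonical is None:
            # Genes outside the four covered here need manual review.
            raise ValueError(f"no canonical entry known for {gene}")
        if accession == canonical:
            # The surviving canonical entry is kept, not obsoleted.
            continue
        plan.append({"obsolete": accession, "replacement": canonical})
    return plan


# "X00001" is a made-up placeholder, not a real UniProt accession.
example = plan_obsoletions([("X00001", "HLA-A"), ("P04439", "HLA-A")])
print(example)
```

Each record in the returned plan corresponds to one allele instance whose _deletedInstance would point at the listed replacement.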
As we do not currently have annotations to show different reactivities of different HLA alleles, e.g., as associated with differential susceptibility to autoimmune diseases, we chose not to annotate any allelic variants by associating geneticallyModifiedResidue instances with canonical forms of the proteins. This can be done in the future if needed.
The cleanup checklist is here in a folder with additional supporting documentation.