geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Reactome proteoforms that point to deleted UniProt entries #232

Closed deustp01 closed 1 year ago

deustp01 commented 1 year ago

UniProt originally created large numbers of entries for HLA-A, -B, -C, and -DRB1 antigens, reflecting the enormous number of distinct HLA (histocompatibility) antigens. Human genome sequencing revealed that each of these antigen sets arises from extensive polymorphic variation of a single gene product. As a result, in 2019 UniProt obsoleted all but one entry each for HLA-A, -B, -C, and -DRB1. Reactome, meanwhile, had separate entries for many of these now-obsolete allele instances, almost all of them used to create defined_set instances that were in turn used to annotate reactions involving HLA antigens.

The fix is to obsolete all of these allele instances in Reactome, with a _deletedInstance attribute in each case pointing from the allele instance to the surviving canonical UniProt sequence for HLA-A (P04439), HLA-B (P01889), HLA-C (P10321), or HLA-DRB1 (P01911).
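For anyone scripting a similar check, the redirect described above can be sketched roughly as follows. This is only an illustration, not the Reactome curation tooling: the function name `redirect_obsolete_allele`, the record layout, and the placeholder accession `X00000` are all hypothetical. Only the four canonical accessions come from this issue.

```python
# Canonical UniProt accessions named in this issue, keyed by gene symbol.
CANONICAL_UNIPROT = {
    "HLA-A": "P04439",
    "HLA-B": "P01889",
    "HLA-C": "P10321",
    "HLA-DRB1": "P01911",
}


def redirect_obsolete_allele(gene, obsolete_accession):
    """Build a hypothetical redirect record for one obsolete allele entry.

    The dict returned here is an assumed stand-in for whatever a
    _deletedInstance attribute actually stores in Reactome.
    """
    if gene not in CANONICAL_UNIPROT:
        raise ValueError(f"no canonical accession recorded for {gene!r}")
    canonical = CANONICAL_UNIPROT[gene]
    if obsolete_accession == canonical:
        # The surviving entry must not be redirected to itself.
        raise ValueError(f"{canonical} is the canonical entry for {gene}")
    return {
        "gene": gene,
        "obsolete": obsolete_accession,
        "replacement": canonical,
    }


# "X00000" is a made-up placeholder, not a real UniProt accession.
record = redirect_obsolete_allele("HLA-A", "X00000")
print(record["replacement"])
```

Keeping the four canonical accessions in a single lookup table means an unknown gene, or an attempt to redirect the canonical entry to itself, fails loudly instead of producing a bad pointer.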

As we do not currently have annotations to show different reactivities of different HLA alleles, e.g., as associated with differential susceptibility to autoimmune diseases, we chose not to annotate any allelic variants by associating geneticallyModifiedResidue instances with canonical forms of the proteins. This can be done in the future if needed.

The cleanup checklist is here, in a folder with additional supporting documentation.

deustp01 commented 1 year ago

Clean-up is complete, so this issue can be closed. @nataled

nataled commented 1 year ago

Just an FYI: PRO retained all those allelic variants that were deprecated in UniProtKB.

nataled commented 1 year ago

I have verified that, in all cases given in the first sheet of the indicated spreadsheet, the PRO entry for the deprecated UniProtKB entries is given by PR:.

deustp01 commented 1 year ago

This ticket was opened in error, because it duplicates #226. All new discussion and links from yesterday have been moved to that ticket, so this one can be ignored. Apologies for the confusion.