Open ReneRanzinger opened 1 month ago
@katewarner and @jeet-vora, can you please look into this? Details are given below.
Protein accession history is based on the reviewed/protein_uniprotkb_accession_history.csv dataset, which is created from downloads/ebi/current/accession-history-homo-sapiens.tsv.
The download file says "Q96BV4" is replaced by "P99999", but there is no such history on the UniProt pages:
$ grep P99999 downloads/ebi/current/accession-history-homo-sapiens.tsv
old_accession current_accession
A4D166, B2R4I1, P00001, Q6NUR2, Q6NX69, Q96BV4 P99999
According to the UniProt protein text file, "Q96BV4" has been merged into "P99999", which is the canonical accession, so the error message in GlyGen is correct.
https://rest.uniprot.org/uniprotkb/P99999.txt
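For anyone who wants to check this against the text file directly, here is a minimal sketch. In the UniProt flat-file format the `AC` line lists the primary accession first, followed by the secondary accessions. The function name `parse_accessions` is hypothetical, and the sample text is a hand-written excerpt built from the accessions in the grep output above, not a copy of the real file.

```python
def parse_accessions(flat_text):
    """Return (primary, [secondary, ...]) from the AC lines of one UniProt flat-file entry."""
    accessions = []
    for line in flat_text.splitlines():
        if line.startswith("AC   "):
            # AC lines hold semicolon-separated accessions; the first one is the primary.
            accessions.extend(a.strip() for a in line[5:].split(";") if a.strip())
        elif accessions:
            # AC lines are contiguous, so stop at the first non-AC line after them.
            break
    if not accessions:
        raise ValueError("no AC line found")
    return accessions[0], accessions[1:]


# Hand-written sample (assumption): only the AC line matters for this sketch.
sample_entry = (
    "AC   P99999; A4D166; B2R4I1; P00001; Q6NUR2; Q6NX69; Q96BV4;\n"
    "DE   (remaining lines omitted)\n"
)

primary, secondary = parse_accessions(sample_entry)
print(primary)                 # P99999
print("Q96BV4" in secondary)   # True
```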
Raja recently proposed that any search for a secondary accession in GlyGen should redirect the user to the canonical protein entry, so I'm currently looking into this. I would like to clarify with Jie what "old accession" means in the accession history doc. To me it looks like "old accession" there corresponds to a secondary accession in UniProt. If so, we could use this file for the mapping, and perhaps also update the history section to show that Q96BV4 has been merged into P99999.
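A rough sketch of what that mapping could look like, assuming the history file is tab-separated with the two columns shown in the grep output (`old_accession` as a comma-separated list, `current_accession` as a single ID). The function names are hypothetical, and this is not the existing GlyGen code.

```python
import csv
import io


def build_secondary_to_canonical(tsv_text):
    """Map each old (secondary) accession to its current accession."""
    mapping = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        current = row["current_accession"].strip()
        for old in row["old_accession"].split(","):
            old = old.strip()
            if old and old != current:
                mapping[old] = current
    return mapping


def resolve_accession(accession, mapping):
    """Return the canonical accession, or the input unchanged if it is not a known old one."""
    return mapping.get(accession, accession)


# Sample row taken from the grep output above; the tab delimiter is an assumption.
sample = (
    "old_accession\tcurrent_accession\n"
    "A4D166, B2R4I1, P00001, Q6NUR2, Q6NX69, Q96BV4\tP99999\n"
)

mapping = build_secondary_to_canonical(sample)
print(resolve_accession("Q96BV4", mapping))  # P99999
```

Note that in this sketch, if an old accession appears in more than one row, the last row wins. Whether that can happen in the real file is something to confirm along with the meaning of "old accession".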
@katewarner it's not about the error message for Q96BV4 (https://www.glygen.org/protein/Q96BV4). That is fixed now.
But if you go to https://www.glygen.org/protein/P99999-1#History, it only shows that the entry was added at a certain time. As far as I remember, it should also have statements for merged, renamed, etc.
@sujeetvkulkarni can you confirm this?
@katewarner this is not done yet. But should we update the history section to include merged/renamed entries as well?
Related to #1776.
Protein ID: Q96BV4 (https://www.glygen.org/protein/Q96BV4)
redirects to P99999-1.
But the history section on this page does not have any evidence for this. Shouldn't it show that it was merged or renamed?