We currently only integrate UniProt proteins that have a reference proteome ID. Do we also want to integrate the 195 UniProt reviewed entries that are not part of the species reference proteome (entries without a reference proteome ID)?
Pedro at UniProt is currently looking at why the entries are not part of the reference proteome, so at least some of the entries may be assigned a reference proteome ID. However, most of them may never have a reference proteome ID, for different reasons: they are assigned to a non-reference proteome, they have only been sequenced at the protein level, or they cannot be mapped to the proteome/proteome component.
The issue is that there may be many interesting proteins that are not part of the reference proteome, including immunoglobulins. Among these entries, 30 have a glycoprotein keyword and 12 have a glycosidase keyword.