geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Reactome: question about GO cellular component for GPI-anchored proteins #150

Closed nataled closed 2 years ago

nataled commented 2 years ago

In checking the modifications for GPI-anchored proteins (GPI-APs), I came across the following types of cases with respect to localization.

- 75 instances - GO:0005886 plasma membrane
- 66 instances - GO:0005576 extracellular region (these are all GPI-APs that have been shed)
- 16 instances - other

The plasma membrane and extracellular region cases are those that I expect are final(ish) locations. The others, I assume, are intermediates. This in itself is not an issue, but when I see differences it raises the question of whether or not such differences reflect actual biology vs curator choice (or the related, standard operating procedure at the time of curation). On the chance these differences are not based on biology I thought it appropriate to bring these to your attention. The outcome, by the way, will have no impact on the export to PRO.

The Golgi membrane and endoplasmic reticulum cases are, at least to me, uncontroversial. The others fall into two issues: (1) the 'transport vesicle' cases...perhaps should be GO:0030658 "transport vesicle membrane"; and (2) inconsistency in how the molecule is transported. Again, I'm not sure if those inconsistencies reflect the biology, or if there is another reason. I bring this to your attention in case it's the latter and you want to standardize.


```
R-HSA-6808768   GO:0000139   Golgi membrane
R-HSA-6808777   GO:0000139   Golgi membrane
R-HSA-5689453   GO:0005789   endoplasmic reticulum membrane
R-HSA-162687    GO:0005789   endoplasmic reticulum membrane
R-HSA-5689457   GO:0005789   endoplasmic reticulum membrane
R-HSA-5689806   GO:0012507   ER to Golgi transport vesicle membrane
R-HSA-5689801   GO:0012507   ER to Golgi transport vesicle membrane
R-HSA-2201284   GO:0010008   endosome membrane
R-HSA-6803315   GO:0030667   secretory granule membrane
R-HSA-5689808   GO:0033116   endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-5689804   GO:0033116   endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-6807756   GO:0033116   endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-6808775   GO:0030133   transport vesicle
R-HSA-6808779   GO:0030133   transport vesicle
R-HSA-6808771   GO:0030133   transport vesicle
```

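
For anyone who wants to repeat this kind of check, here is a minimal sketch that groups the rows listed above by GO term and flags any location whose label does not name a membrane. The function names (`parse_rows`, `group_by_term`, `flag_non_membrane`) and the "label lacks the word membrane" rule are assumptions made for illustration only; they are not part of pathways2GO or of Reactome's QA.

```python
from collections import defaultdict

# Rows copied from the list above: Reactome ID, GO ID, GO label.
ROWS = """\
R-HSA-6808768 GO:0000139 Golgi membrane
R-HSA-6808777 GO:0000139 Golgi membrane
R-HSA-5689453 GO:0005789 endoplasmic reticulum membrane
R-HSA-162687 GO:0005789 endoplasmic reticulum membrane
R-HSA-5689457 GO:0005789 endoplasmic reticulum membrane
R-HSA-5689806 GO:0012507 ER to Golgi transport vesicle membrane
R-HSA-5689801 GO:0012507 ER to Golgi transport vesicle membrane
R-HSA-2201284 GO:0010008 endosome membrane
R-HSA-6803315 GO:0030667 secretory granule membrane
R-HSA-5689808 GO:0033116 endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-5689804 GO:0033116 endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-6807756 GO:0033116 endoplasmic reticulum-Golgi intermediate compartment membrane
R-HSA-6808775 GO:0030133 transport vesicle
R-HSA-6808779 GO:0030133 transport vesicle
R-HSA-6808771 GO:0030133 transport vesicle
"""


def parse_rows(text):
    """Split each non-empty line into (reactome_id, go_id, label)."""
    records = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # the label may itself contain spaces
        if len(parts) == 3:
            records.append(tuple(parts))
    return records


def group_by_term(records):
    """Map (go_id, label) to the list of Reactome IDs annotated with it."""
    groups = defaultdict(list)
    for reactome_id, go_id, label in records:
        groups[(go_id, label)].append(reactome_id)
    return dict(groups)


def flag_non_membrane(groups):
    """Hypothetical heuristic: a GPI-AP location whose label does not
    mention 'membrane' may be less specific than intended."""
    return {term: ids for term, ids in groups.items()
            if "membrane" not in term[1]}


if __name__ == "__main__":
    flagged = flag_non_membrane(group_by_term(parse_rows(ROWS)))
    for (go_id, label), ids in flagged.items():
        print(go_id, label, ids)
```

On these rows, the only term the heuristic flags is GO:0030133 "transport vesicle". A string match on the label is crude; a real check would more likely walk the GO hierarchy, but that is beyond this sketch.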
deustp01 commented 2 years ago

We are trying to capture steps in a translocation process in which a modified protein or complex moves from one vesicle membrane environment to the next to the next as it matures and traverses a cell, so multiple locations, including some that are transient, are OK. But all molecules in a particular location should be tagged with the same GO cell_component to show this, and these location tags should be specific. Thus, "transport vesicle" seems wrong unless there are no experimental data to allow us to assign a more specific location. But if that's the case, why are we placing the entity in a finely detailed transport pathway?

Meanwhile, as you say, uncertainties about exactly where an entity is have no effect on the process for generating correct PRO identifiers (we positively want these to be location-agnostic), so I think it is safe to close this ticket while curators investigate and improve the compartment assignments for these entities. And it's worth noting that our QA does not pick up this sort of inconsistent usage, so if it falls out easily from the PRO ID project, that is a bonus for us - thanks!

nataled commented 2 years ago

Alas, I don't typically perform any checks for location inconsistencies, precisely for the reason stated--it's immaterial to the PRO import. These came onto my radar for other reasons, and because there were only a few to look at, the location inconsistency just jumped out at me.

deustp01 commented 2 years ago

Got it - but it's definitely worth flagging oddities like this that you do notice just because you're coming at the material from a distinct point of view and are likely to catch things we've become numb to.