geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)
8 stars 0 forks source link

The direction of a transport reaction is backwards in the new import #270

Closed ukemi closed 8 months ago

ukemi commented 11 months ago

In this model: gomodel:R-HSA-70268 (Pyruvate metabolism) There is a transport reaction: R-HSA-9012374 (VDAC1 transports PYR from cytosol to mitochondrial intermembrane space) There reaction transports pyruvate from the [GO:0005829] cytosol to the [GO:0005758] mitochondrial intermembrane space. I looked at Reactome and the locations are correct wrt inputs and outputs, but in the latest import, the direction is backwards.

dustine32 commented 10 months ago

@ukemi It looks like this is due to the switch to ChEBI IDs for small molecules, away from the location-specific REACTO IDs.

Before the CHEBI switch, start location cytosol ID R-ALL-29398 and end location mitochondrial intermembrane space ID R-ALL-9012375 the Reactome IDs were being incorporated into the GO-CAM IRIs for PYR like:

<http://model.geneontology.org/R-ALL-29398_R-HSA-9012374>
<http://model.geneontology.org/R-ALL-9012375_R-HSA-9012374>

The CHEBI switch automatically meant that both of the above IRIs would change to this identical CHEBI-derived IRI <http://model.geneontology.org/CHEBI_15361_R-HSA-9012374>, causing an ID collision and confusing the conversion on which location was the input (start) and which was the output (end).

I'll need to play with how the IRIs are constructed to make them unique again.

ukemi commented 10 months ago

Argh. I see what you mean. Or perhaps deconstruct the IRI's to include a location in the OWL expression?

ukemi commented 10 months ago

Yay! Thanks @dustine32 Should I double check before I close? Are the models on production or development?

dustine32 commented 10 months ago

@ukemi They'll be in noctua-dev (very most likely) by the end of today

ukemi commented 10 months ago

Perfect! I will check and then give the thumbs up to push to production. if that's ok. @deustp01 do you mind if we push this bug fix between releases?

deustp01 commented 10 months ago

do you mind if we push this bug fix between releases?

If it works for Dustin, it's fine for me.

dustine32 commented 10 months ago

@ukemi @deustp01 The updated models are now in noctua-dev.

ukemi commented 10 months ago

It looks like in the above model, all of the transport reactions have been corrected. However, to be comprehensive I checked a few other transport pathways. For R-HSA-432040,R-HSA-432047 and R-HSA-917937all looks good. In R-HSA-432030 it looks like we are not representing transport as we are in the other cases. I wonder if this is because it is a channel activity rather than a transporter activity??? I will open a new ticket for this. I think for this ticket we can merge into production and close. Nice work @dustine32

dustine32 commented 10 months ago

@ukemi Reaction R-HSA-432074 in pathway R-HSA-432030 isn't detected as a transport reaction due to a much more boring reason than suspected: case sensitivity of "Glycerol" vs "glycerol" in the Reactome (and thus BioPAX) labels: image It turns out that exact matching on labels is used by the transport inference SPARQL query for confirming two individuals have the same type: https://github.com/geneontology/pathways2GO/blob/4e2f4c711d076e6622451aeebd01957a17cd1949/exchange/src/main/resources/org/geneontology/gocam/exchange/query2update_localization.rq#L35 This used to be necessary when we used REACTO class types for small molecules since the types were location-specific. But, after we switched to using ChEBI classes we can probably get rid of this "labels must match" criteria. Does this sound good?

deustp01 commented 10 months ago

Formally a tangent, but even if GO-CAM can tolerate these mismatches, users at Reactome (both human and machine) would probably prefer that we cleaned them up. @dustine32 if it's easy can you generate a list of things that fail the "labels must match" test, even one with some false positives and false negatives?

ukemi commented 10 months ago

@dustine32 Yes, let's use the ChEBI identifiers. I suspected the reason was something like this. @deustp01 I agree clean-up even for minor things like case mis-matches is probably a good goal.

ukemi commented 10 months ago

I still think we can merge this and then close and move it over to the appropriate column in the project.