geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

IUPHAR drug mappings missing in 2021-05-01 released reacto.owl #136

Closed dustine32 closed 2 years ago

dustine32 commented 3 years ago

This is coming out of https://github.com/geneontology/pathways2GO/pull/135.

The test testDrugReactionDeletion was failing because the reacto.owl (pulled via http://purl.obolibrary.org/obo/go/extensions/reacto.owl) was missing IUPHAR mappings to molecules involved in reaction R-HSA-9674015. In fact, there were zero IUPHAR mappings in the 2021-05-01 reacto.owl used in the test.

We will need to debug the reacto.owl build code to figure out how these IUPHAR mappings went missing.

dustine32 commented 2 years ago

@deustp01 I have an old copy of Reactome release 74 Homo_sapiens.owl that has 375 IUPHAR mappings. Compare this to the latest release 80 file which has zero. Do you know if there's a reason that these IUPHAR mappings are no longer in the Reactome BioPAX releases? We may have to source these from somewhere else.
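A comparison like this can be reproduced by tallying the database names on the xrefs in each BioPAX file. The following is a minimal sketch, assuming BioPAX Level 3 RDF/XML in which every xref carries a `bp:db` element; the function name, the sample document, and the `EXAMPLE-n` identifiers are all made up for illustration and are not taken from the Reactome files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# BioPAX Level 3 namespace; bp:db holds the external database name of an xref.
BP = "http://www.biopax.org/release/biopax-level3.owl#"


def count_xref_dbs(biopax_xml: str) -> Counter:
    """Count how often each database name appears in bp:db elements."""
    root = ET.fromstring(biopax_xml)
    return Counter((el.text or "").strip() for el in root.iter(f"{{{BP}}}db"))


# Tiny hand-written sample, NOT real Reactome content.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#">
  <bp:UnificationXref rdf:ID="x1">
    <bp:db>IUPHAR</bp:db><bp:id>EXAMPLE-1</bp:id>
  </bp:UnificationXref>
  <bp:UnificationXref rdf:ID="x2">
    <bp:db>Guide to Pharmacology</bp:db><bp:id>EXAMPLE-2</bp:id>
  </bp:UnificationXref>
  <bp:UnificationXref rdf:ID="x3">
    <bp:db>Guide to Pharmacology</bp:db><bp:id>EXAMPLE-3</bp:id>
  </bp:UnificationXref>
  <bp:UnificationXref rdf:ID="x4">
    <bp:db>ChEBI</bp:db><bp:id>EXAMPLE-4</bp:id>
  </bp:UnificationXref>
</rdf:RDF>"""

counts = count_xref_dbs(SAMPLE)
for db_name, n in sorted(counts.items()):
    print(db_name, n)
```

For a file on disk, the same tally could be run on `ET.parse(path).getroot()` instead of a string. Running it on two releases side by side would show whether a database name dropped to zero or was simply replaced by another name.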

deustp01 commented 2 years ago

@dustine32 I have asked locally about the history here. We may have switched our external reference DB for these instances from IUPHAR to Guide to Pharmacology (G2P). I'll let you know.

dustine32 commented 2 years ago

@deustp01 Ahhh, great, this should be a simple fix for me then! I'll just add "Guide to Pharmacology" to the "is it a drug?" code. I confirmed these mappings are in the release 80 BioPAX for our example drugs (Antabuse, olaratumab).

Specifically, we'll need to update code here: https://github.com/geneontology/pathways2GO/blob/9ac81db73cfff952739b6701203132a05837fb80/exchange/src/main/java/org/geneontology/gocam/exchange/BioPaxtoGO.java#L974 and here: https://github.com/geneontology/pathways2GO/blob/9ac81db73cfff952739b6701203132a05837fb80/exchange/src/main/java/org/geneontology/gocam/exchange/PhysicalEntityOntologyBuilder.java#L1034
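The shape of that change can be sketched as follows. This is an illustration only: the real check is Java code in `BioPaxtoGO.java` and `PhysicalEntityOntologyBuilder.java`, and the names `DRUG_XREF_DBS` and `is_drug`, as well as the case-insensitive comparison, are assumptions rather than what the repository does.

```python
# Hypothetical set of xref database names that mark an entity as a drug.
# "IUPHAR" is the old name; "Guide to Pharmacology" is the addition discussed above.
DRUG_XREF_DBS = {"iuphar", "guide to pharmacology"}


def is_drug(xref_db_names):
    """Return True if any xref database name identifies the entity as a drug.

    Names are compared case-insensitively (an assumption made for this sketch).
    """
    return any(name.strip().lower() in DRUG_XREF_DBS for name in xref_db_names)


# An entity with a Guide to Pharmacology xref is treated as a drug;
# one with only ChEBI and UniProt xrefs is not.
print(is_drug(["ChEBI", "Guide to Pharmacology"]))
print(is_drug(["ChEBI", "UniProt"]))
```

Keeping the recognized names in one set means that a future rename of the external database only needs a one-line change, provided both call sites share the set.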

ukemi commented 2 years ago

@deustp01 Did you review this in the latest load? I haven't yet. If it passes, we can close and move the ticket along in the project.

deustp01 commented 2 years ago

@deustp01 Did you review this in the latest load?

I will try to figure out what instances failed before. This comment from #160 looks promising as a lead to problems -

In gomodel:R-HSA-5620971, this activity unit corresponds to a reaction with a drug molecule (antabuse) as an input. Isn't the model-building process supposed to look for reactions with entities whose schema class is "ChemicalDrug" (also "ProteinDrug") and omit those reactions from GO-CAM models?

and recheck. I'm moving slowly.

ukemi commented 2 years ago

No worries. Just didn't want it to fall off the page. We should check a Reactome reaction that is activated or inhibited by a drug.

ukemi commented 2 years ago

In the pathway Aspirin ADME (R-HSA-9749641), there is a step where ASA dissolves (R-HSA-9757434). If I look at the model, I don't see this step. This is at least one confirmation that the aspirin step is being filtered out. The next transport step (R-HSA-9749607) is there, but if I am interpreting the pathway correctly, the ASA- is now being considered a substrate chemical, so this should pass and it is indeed in the model.

In the RA biosynthesis pathway (R-HSA-5365859) there is a drug step 'Acitretin binds to RAR:RXR--R-HSA-9009817'. I do not see this in the imported model.

For the ADME pathways (R-HSA-974878), the imports look a bit odd because it looks like all the drug reactions are being filtered out. There are lots of orphaned xenobiotic metabolism nodes. I think this is something for a later day.

deustp01 commented 2 years ago

In gomodel:R-HSA-5620971, this activity unit corresponds to a reaction with a drug molecule (antabuse) as an input. Isn't the model-building process supposed to look for reactions with entities whose schema class is "ChemicalDrug" (also "ProteinDrug") and omit those reactions from GO-CAM models?

Now, this reaction / activity unit is missing from the GO-CAM for the pathway, as it should be. One for a different drug elsewhere in the pathway, which I did not check previously, is present - more checking is needed later to see if this is a different kind of error, but IGNORE for now, I think.

For the future - a very large number of reactions in which a drug binds a protein involved in a normal pathway and changes its behavior have just been released (Reactome version 81, June 2022). These all should be excluded from GO-CAMs according to current rules, so we will need a systematic "say no to drugs" QA for the version 82 GO-CAM build.
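One possible shape for such a QA pass, as a sketch only: flag every reaction that has a participant whose schema class is "ChemicalDrug" or "ProteinDrug" and that still appears in a GO-CAM model. The `Reaction` record, the function names, and the `RXN-*` identifiers are hypothetical; how reactions and model contents would actually be loaded is not covered here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Schema classes that, per the rule quoted above, should exclude a reaction.
DRUG_SCHEMA_CLASSES = {"ChemicalDrug", "ProteinDrug"}


@dataclass
class Reaction:
    """Hypothetical minimal record: a reaction ID plus its participants' schema classes."""
    reaction_id: str
    participant_classes: List[str]


def involves_drug(reaction: Reaction) -> bool:
    """True if any participant has a drug schema class."""
    return any(c in DRUG_SCHEMA_CLASSES for c in reaction.participant_classes)


def leaked_drug_reactions(reactions: Iterable[Reaction],
                          gocam_reaction_ids: Set[str]) -> List[str]:
    """Return sorted IDs of drug reactions that are still present in GO-CAM models."""
    return sorted(
        r.reaction_id
        for r in reactions
        if involves_drug(r) and r.reaction_id in gocam_reaction_ids
    )


# Made-up example data; the IDs are placeholders, not Reactome stable IDs.
reactions = [
    Reaction("RXN-A", ["SimpleEntity", "EntityWithAccessionedSequence"]),
    Reaction("RXN-B", ["ChemicalDrug", "EntityWithAccessionedSequence"]),
    Reaction("RXN-C", ["ProteinDrug", "Complex"]),
]
in_gocam = {"RXN-A", "RXN-B"}  # RXN-C was correctly filtered out

print(leaked_drug_reactions(reactions, in_gocam))
```

An empty result would mean every drug reaction was screened out; any ID returned is a candidate for the kind of manual recheck described in this thread.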

deustp01 commented 2 years ago

In the pathway Aspirin ADME (R-HSA-9749641), there is a step where ASA dissolves ...

Your interpretation looks good to me. Also, editorially, while the summation blurb says useful things that should be preserved somewhere, maybe in the overview summation for the whole pathway, and while the interaction of protonated aspirin with water to yield ionized aspirin plus water plus a proton is formally correct, I don't see any gain from this step - simpler to start with ionized aspirin outside the cell and ready for transport.

ukemi commented 2 years ago

So for the aspects of this ticket, I think it looks like things are working as we expect (drugs are being screened out) and there are a few edge cases that we need to investigate. If you agree, should we close this one?

deustp01 commented 2 years ago

ASA- is now being considered a substrate chemical, so this should pass and it is indeed in the model.

I hope so, and this is the effect we want

For the ADME pathways (R-HSA-974878), the imports look a bit odd

R-HSA-9748787, right? Yes. In this pathway, the curator has assigned a GO BP term to the whole pathway (xenobiotic catabolism) and also assigned more specific GO BP terms to each of the reactions in the pathway (xeno transport; xeno catabolism, etc.). The GO-CAM converter may be picking up these annotations to create the "orphan" nodes. These in turn appear to have a part_of connection to the corresponding activity unit where that activity survived filtering, or to be present as unconnected orphans where filtering suppressed the reaction. Consistent with all of this, the one reaction for which the curator did not assign a reaction-level BP term has no part_of link to a xenobiotic BP term.

TO DO for now - nothing, I think.

TO DO for version 82 - revisit our curation practice and the GO-CAM conversion process. I suspect that we want to be able to assign broader BP terms to pathways and more specific BP terms to reactions contained in those pathways, but we don't always want to do this, so we will need to discuss with Dustin how to do this without confusing the GO-CAM conversion process.

deustp01 commented 2 years ago

In the RA biosynthesis pathway (R-HSA-5365859) there is a drug step 'Acitretin binds to RAR:RXR--R-HSA-9009817'. I do not see this in the imported model.

That reaction, alas, was only released last week, so it's too new to be in the BioPAX Dustin is using now. It's one of the many drug reactions we will need to check carefully as part of the version 82 GO-CAM build.

deustp01 commented 2 years ago

So for the aspects of this ticket, I think it looks like things are working as we expect (drugs are being screened out) and there are a few edge cases that we need to investigate. If you agree, should we close this one?

Yes, time to close.