Closed: goodb closed this 4 years ago
@ukemi @deustp01 quick check in here before I delve into this. Are you two still certain that we want to eliminate reactions that involve drugs from the conversion process? It seems to me that it reduces the value of the resulting GO-CAMs and certainly adds another degree of complexity to the code. Obvious questions of high value like 'what downstream effects might this drug have' are now impossible to answer with the Reactome-generated GO-CAM knowledge base.
I think it makes sense to continue keeping the disease pathways out of the conversion process for now because we don't have a good representation for broken genes or processes in GO-CAM. But could you remind me what problem including reactions involving drugs creates for GO-CAM?
We eliminate them because strictly they are out of scope for GO. The pathways and reactions that include drugs are not 'normal' pathways and may as a result not reflect the overall 'normal' role of gene products that are also in those pathways or reactions. But I do agree, those and the disease pathways are extremely valuable to our overall mission. BIOMED-CAM. Another grant proposal!
It definitely makes the kinds of questions I've been toying with impossible unless the query is first manually distilled down to the gene products with which the drugs interact (which would already be done if we were to keep them in).
We eliminate them because strictly they are out of scope for GO
I don't see a way around the scope argument.
Naive biologist idea: drugs can be transported and metabolized and do other things as well, but now and for a while to come, the only way that Reactome annotates a drug is to create a reaction in which the drug binds to a protein (normal or genetic mutant) to yield an inert protein-drug complex, and that complex is then annotated as a negative regulator of whatever functions the unbound protein has. As a result of this limitation, no information would be lost if a user were able to view any GO-CAMs involving that protein with the relevant activity units deleted. (This same functionality would enable a user to use GO-CAMs to ask what the possible effects of a loss-of-function mutation in a protein are.)
This would avoid discussions of whether there is a space for drugs in GO, but no clue here whether it's practical to implement. It's easy in a Reactome pathway view because we already have display tools to render the normal pathway with big red X's superimposed on the protein(s) whose function has been lost - look at diseases of metabolism for examples.
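To make the "view the model with the relevant activity units deleted" idea concrete, here is a rough sketch of the underlying query: given a protein, find its activity units and everything causally downstream of them. It is only an illustration under stated assumptions. The toy `model` dictionary, the protein and activity names, and `affected_by_loss` are all hypothetical and do not reflect the real GO-CAM data model or any Noctua API.

```python
from collections import deque

# Hypothetical toy model: each activity unit maps to
# (enabling protein, list of causally downstream activity units).
model = {
    "A1": ("ProteinX", ["A2"]),
    "A2": ("ProteinY", ["A3", "A4"]),
    "A3": ("ProteinZ", []),
    "A4": ("ProteinW", []),
    "A5": ("ProteinV", ["A4"]),
}


def affected_by_loss(model, protein):
    """Return the activity units enabled by `protein` plus everything downstream.

    This is a plain breadth-first traversal over the causal edges.
    """
    start = [unit for unit, (enabler, _) in model.items() if enabler == protein]
    seen = set(start)
    queue = deque(start)
    while queue:
        unit = queue.popleft()
        for downstream in model[unit][1]:
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return seen


# Units that would be marked with a "red X" if ProteinY were inhibited or lost.
lost_units = affected_by_loss(model, "ProteinY")
```

The same traversal would serve both the drug case and the loss-of-function case, since both reduce to "this protein's activities are gone".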
Great idea, but the scope issue still looms over this I think. I agree with you that one of the major use cases for the models will be to look at downstream effects of perturbations, whether by pharmacological means or genetic means.
Another way to say it is to ask if we could have an overlay feature for viewing GO-CAMs (maybe an overlay on the Cytoscape view of them).
Another plus to having this kind of overlay is that it lets us expand this discussion to regulatory interactions in general, echoing back to the 'sometimes ATP can be a drug' discussion. Also expand to disease views?
Sorry to be a downer, but new graphical overlays in the context of Noctua seem years, grants, and developers away from where we are right now. It doesn't hurt to think about them, especially with the new GO grant about to be written, but we shouldn't do anything in the conversion now that assumes they will happen.
That being said, let me try one last time from my usual tack. If we were to properly add the reaction Olaratumab binds PDGFRA into the GO-CAM for its home pathway Signaling by PDGF, what would be the negative consequences in terms of GO use cases?
Defining the scope of the GO project as 'normal biology' is helpful as it guides our work such that it synergizes well with other efforts. This is most important for curators who have an essentially infinite workload. But for an import like this we are taking the product of an external resource that has already done the work. As long as the content doesn't require us to change our ontologies/data-model/software I really do not understand why we would spend more time and energy keeping it out of the system than it takes to have it in there.
Part of my pushback here is that I'm looking at the conversion code from the perspective of re-using it on other resources (like YeastCyc https://github.com/geneontology/go-ontology/issues/20091 ). And I am realizing that we have accumulated quite a bit of Reactome-specific routines in there, and the more of these there are, the less generalizable the framework is. For example, another resource may include drugs as chemicals and not use IUPHAR ids in the same way that Reactome does. That filter would thus not work properly on that other resource. But if we back it off and just go ahead and build Molecular Event nodes with chemicals as inputs, where those chemicals may or may not include manufactured drugs, things are much more likely to work.
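One way to picture the generalizability concern: keep the core conversion resource-agnostic and pass any resource-specific exclusion in as an optional predicate. The sketch below is a minimal Python illustration, not the actual conversion code. The record shapes, the `IUPHAR:` xref prefix convention, the placeholder ids, and the names `has_iuphar_xref` and `build_molecular_events` are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical record shapes, not the real converter's data model.
Chemical = Dict[str, str]    # {"name": ..., "xref": ...}
Reaction = Dict[str, Any]    # {"id": ..., "inputs": [Chemical, ...]}


def has_iuphar_xref(chemical: Chemical) -> bool:
    """Assumed Reactome-style drug test: the chemical carries an IUPHAR xref."""
    return chemical.get("xref", "").startswith("IUPHAR:")


def build_molecular_events(
    reactions: List[Reaction],
    exclude_chemical: Optional[Callable[[Chemical], bool]] = None,
) -> List[Dict[str, Any]]:
    """Build one event per reaction, with its chemicals as inputs.

    If a predicate is supplied, any reaction with a matching input is skipped.
    With no predicate, every reaction is converted, drugs or not.
    """
    events = []
    for reaction in reactions:
        inputs = reaction["inputs"]
        if exclude_chemical is not None and any(exclude_chemical(c) for c in inputs):
            continue
        events.append({"event": reaction["id"], "inputs": [c["name"] for c in inputs]})
    return events


# Placeholder data; the ids and xrefs are made up for illustration.
reactions = [
    {"id": "rxn-1", "inputs": [{"name": "ATP", "xref": "CHEBI:placeholder"}]},
    {"id": "rxn-2", "inputs": [{"name": "some drug", "xref": "IUPHAR:placeholder"}]},
]

generic = build_molecular_events(reactions)
reactome_style = build_molecular_events(reactions, exclude_chemical=has_iuphar_xref)
```

With this shape, a resource that does not use IUPHAR ids simply passes no predicate (or its own), and the core routine does not need to know about Reactome conventions.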
I think your arguments are all good. It seems like this should be a PI decision. I can certainly see its utility.
So, a question: how complete are the annotations? i.e. is the drug data consistently captured in Reactome? And a general question: how often would these CAM models be refreshed from Reactome? What is the expectation? Is there any sense that we can 'keep up with' drug bindings? What is Reactome's experience with drugs recalled from the market, etc.? Oh, maybe that's 'questions'.
The CAM models would be refreshed with every Reactome release. @deustp01 can you shed light on the other questions?
Code is ready either way here. Please advise as it impacts counts needed for manuscript.
@judyblake
is the drug data consistently captured in Reactome?
We are annotating groups of interesting drugs, starting with drugs that affect blood coagulation, and now including everything we can find in the literature, and that B Shoichet thinks is plausible, that affects SARS-CoV-2 infection and host responses to it. So it's a small, skewed sample, distinct from operations like ChEMBL or PharmGKB or DrugCentral that aim for comprehensive coverage. So we haven't even addressed the issue of "keeping up".
what is Reactome's experience with drugs recalled from the market
We strongly prefer to annotate drugs that are approved for clinical use somewhere but this is not an absolute requirement. Unlike those other resources, our target users are researchers - could be translational research - but not medical practitioners. So we in fact do annotate, e.g., some kinase inhibitors that target mutant forms of kinases but that are too toxic for clinical use in humans.
what is Reactome's experience with drugs recalled from the market
Even then, it might be useful to have information about the drugs because they may come into fashion later, e.g. thalidomide.
I think @goodb has an excellent point above. If the information is of high quality, is there for the taking, requires effort to ignore, then why not have it? We have looked and worked with Reactome enough to know the quality. However, I would not recommend wholesale import of resources without working with them closely first. My 2 cents.
what is Reactome's experience with drugs recalled from the market
Reactome's focus is approved drugs. Where a group of drugs have pharmacological action on a target, we place that group of drugs into a "defined set" where all drugs are members. There are some occasions where we've curated experimental drugs that are in late stage clinical trials. Since these drugs are not fully approved yet (but the hope is that they will be), we place these drugs in a "candidate set" where approved drugs are members and these experimental drugs are candidates.
Recalled drugs, if we felt we needed to include them, would fall into a candidate set of drugs. However, we steer clear of such drugs: even though they still display pharmacological activity at a target, they have been withdrawn because of adverse effects.
Even then, it might be useful to have information about the drugs because they may come into fashion later, e.g. thalidomide.
True, but as mentioned above, the focus is on approved drugs.
I think @goodb has an excellent point above. If the information is of high quality, is there for the taking, requires effort to ignore, then why not have it? We have looked and worked with Reactome enough to know the quality. However, I would not recommend wholesale import of resources without working with them closely first. My 2 cents.
In my humble opinion, the information is of high quality! We started drug curation with cardiovascular drugs (anticoagulants, anti-anginals, antihypertensives, antiarrhythmics). The basic premise is that a drug binds a target and the resultant complex is a regulator of the target (positive or negative). This is how simple we'd like to keep drug curation. It doesn't affect the existing Reactome pathway; see it as an overlay of information.
Now we curate drugs for the main targets known to pharma (namely GPCRs and ion channels) because these targets have the biggest reserve of approved drugs, and of late we've also concentrated on potential therapeutics against COVID-19 infection. As we're doing these, we can also be swayed by collaborations which may ask us to curate specific targets, so we're very flexible in our drug curation approach.
The conclusion reached offline here was that, for now, we will keep all reactions involving drugs out of the conversion as well as all reactions from disease pathways. Current code does this.
Example R-HSA-186797
It looks like some reactions, e.g. R-HSA-9674095, have drugs in them as inputs/outputs but are not being removed (as they should be according to https://github.com/geneontology/pathways2GO/issues/84 ). In these cases, the reaction is still getting converted, with only the drugs taken out.
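To spell out the difference between the intended behaviour and what is being observed, here is a small Python sketch. It is not the pathways2GO code; the `Participant` and `Reaction` classes, the `contains_drug` flag, the placeholder reaction ids, and both function names are hypothetical. The Olaratumab/PDGFRA entities are borrowed from the example discussed earlier in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Participant:
    name: str
    contains_drug: bool = False  # hypothetical flag; real detection is resource-specific


@dataclass
class Reaction:
    reaction_id: str
    inputs: List[Participant] = field(default_factory=list)
    outputs: List[Participant] = field(default_factory=list)


def involves_drug(reaction: Reaction) -> bool:
    """True if any input or output is, or contains, a drug."""
    return any(p.contains_drug for p in reaction.inputs + reaction.outputs)


def drop_drug_reactions(reactions: List[Reaction]) -> List[Reaction]:
    """Intended behaviour: the whole reaction is left out of the conversion."""
    return [r for r in reactions if not involves_drug(r)]


def strip_drugs_only(reactions: List[Reaction]) -> List[Reaction]:
    """Observed behaviour: the reaction is kept and only the drugs are removed."""
    return [
        Reaction(
            r.reaction_id,
            [p for p in r.inputs if not p.contains_drug],
            [p for p in r.outputs if not p.contains_drug],
        )
        for r in reactions
    ]


# Placeholder ids; only the entity names come from the discussion above.
reactions = [
    Reaction(
        "example-drug-binding",
        inputs=[Participant("Olaratumab", True), Participant("PDGFRA")],
        outputs=[Participant("Olaratumab:PDGFRA", True)],
    ),
    Reaction(
        "example-normal",
        inputs=[Participant("PDGF"), Participant("PDGFRA")],
        outputs=[Participant("PDGF:PDGFRA")],
    ),
]

kept = drop_drug_reactions(reactions)
stripped = strip_drugs_only(reactions)
```

Under the second behaviour the drug-binding reaction survives as a reaction with PDGFRA as its only input and no output, which is the kind of leftover the agreed policy is meant to avoid.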