Open cbizon opened 4 months ago
One option is to just modify the rule to be what we think it should be by reversing the directions of the chemical/gene edges. It may be that if the TMKP edges had been in the data with the correct direction, we would have found that rule.
We want to try removing TMKP. This will involve adding TMKP to the deny list for increases/decreases edges...
@cbizon here's an example query with the denylist:
{
  "message": {
    "query_graph": {
      "edges": {
        "e0": {
          "predicates": [
            "biolink:treats"
          ],
          "subject": "n0",
          "object": "n1",
          "provided_by": {
            "denylist": [
              "infores:tmkp"
            ]
          }
        }
      },
      "nodes": {
        "n0": {
          "ids": [
            "MONDO:0004979"
          ],
          "is_set": false
        },
        "n1": {
          "categories": [
            "biolink:SmallMolecule"
          ],
          "is_set": false
        }
      }
    }
  }
}
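For trying other diseases or other denied sources, the same query can be built programmatically. This is only a sketch: the helper name `build_treats_query` is hypothetical, and the `provided_by`/`denylist` shape is copied from the example above rather than from any documented schema.

```python
import json


def build_treats_query(disease_id, denied_sources):
    """Build a query shaped like the example above (hypothetical helper)."""
    return {
        "message": {
            "query_graph": {
                "edges": {
                    "e0": {
                        "predicates": ["biolink:treats"],
                        "subject": "n0",
                        "object": "n1",
                        # Shape copied from the example query; assumed, not verified.
                        "provided_by": {"denylist": list(denied_sources)},
                    }
                },
                "nodes": {
                    "n0": {"ids": [disease_id], "is_set": False},
                    "n1": {
                        "categories": ["biolink:SmallMolecule"],
                        "is_set": False,
                    },
                },
            }
        }
    }


query = build_treats_query("MONDO:0004979", ["infores:tmkp"])
print(json.dumps(query, indent=2))
```

Serializing with `json.dumps` turns Python's `False` into the JSON `false` used in the example.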
Starting from https://github.com/NCATSTranslator/Feedback/issues/879
The rule is (chemical a) <-decreases- (gene g) -decreases-> (chemical b) -treats-> (disease d)
The rule comes from our mining, but it does produce odd results like the one listed above.
Some of the problem here seems to be that TMKP has a lot of its edges reversed. So if a paper says "chemical decreases gene", it gets into TMKP and ROBOKOP as "gene decreases chemical". To what extent this causes the rule to exist, I am unsure, but it is in there. It may not be the relevant feature, as there are other similar rules that are for sure driven by CTD.
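To make the reversal concrete, here is a small sketch of what flipping the edges from one source looks like. The `Edge` type, the `flip_edges_from` helper, and the identifiers `CHEM:a`, `CHEM:b`, and `GENE:g` are all made up for illustration. This is not how TMKP or ROBOKOP store edges.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    subj: str
    predicate: str
    obj: str
    source: str


def flip_edges_from(edges, source):
    """Swap subject and object on every edge attributed to `source`."""
    return [
        e._replace(subj=e.obj, obj=e.subj) if e.source == source else e
        for e in edges
    ]


# Placeholder data: the paper says "chemical decreases gene",
# but the edge was stored as "gene decreases chemical".
stored = [
    Edge("GENE:g", "decreases", "CHEM:a", "infores:tmkp"),
    Edge("CHEM:b", "decreases", "GENE:g", "infores:ctd"),
]

corrected = flip_edges_from(stored, "infores:tmkp")
for edge in corrected:
    print(edge)
```

Only the edge attributed to the named source changes direction. Edges from other sources pass through untouched.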
Interestingly, we don't have a rule going the other direction, which I think would be much more easily supportable from a meaning standpoint if not from the statistics...
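For reference, the rule going the other direction can be written down mechanically from the mined one. The sketch below encodes the rule as a list of (subject, predicate, object) hops. That encoding and the helper name are assumptions for illustration, not the format our mining actually uses.

```python
# The mined rule from this issue, one tuple per hop.
MINED_RULE = [
    ("g", "decreases", "a"),
    ("g", "decreases", "b"),
    ("b", "treats", "d"),
]
NODE_TYPES = {"a": "chemical", "b": "chemical", "g": "gene", "d": "disease"}


def reverse_chemical_gene_edges(rule, node_types):
    """Flip every hop that connects a chemical and a gene; leave the rest."""
    result = []
    for subj, pred, obj in rule:
        if {node_types[subj], node_types[obj]} == {"chemical", "gene"}:
            subj, obj = obj, subj
        result.append((subj, pred, obj))
    return result


REVERSED_RULE = reverse_chemical_gene_edges(MINED_RULE, NODE_TYPES)
for hop in REVERSED_RULE:
    print(hop)
```

The result reads (chemical a) -decreases-> (gene g) <-decreases- (chemical b) -treats-> (disease d), with the treats hop unchanged.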