Closed sstemann closed 1 week ago
Paging @dkoslicki and @chunyuma; this looks like an xDTD result
@sstemann what edge type should be used instead? Please pardon my ignorance.
I'm confused @sstemann, I was under the impression that after the `treats` refactor, this was the correct (mixin) predicate
Per @sierra-moxon, mixin predicates are now allowed in TRAPI responses. I have repeatedly confirmed this with the SRI team.
If I am mistaken about the mixin matter, please DM me on Slack.
@saramsey @dkoslicki: It is totally fine to use mixin predicates in KGs. In fact, "treats" is also a mixin.
However, in most of the results from other ARAs, the "inferred" edge (knowledge_level: predicted) uses the "treats" predicate instead of the higher-level "treats_or_applied_or_studied_to_treat" predicate. Along with this "treats" inference edge, we typically see support_graphs (support paths in the UI) that show the edges that go into making that "treats" prediction. For example, we often see a "treats" edge inference from an ARA with support_paths/edges for that inference from TMKP. TMKP uses the "treats_or_applied_or_studied_to_treat" predicate directly; the ARA returns an inferred "treats" edge based on the "treats_or_applied_or_studied_to_treat" edge from TMKP.
Is the idea with this result that the attribute `probability_treats` on this edge conveys some level of confidence that the "treats_or_applied_or_studied_to_treat" edge can be interpreted as a "treats" edge, rather than instantiating the "treats" edge directly from ARAX?
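For concreteness, the edge shape being discussed might look roughly like this. This is a hedged Python sketch of a TRAPI-style edge: the CURIEs, the 0.87 score, the auxiliary-graph id, and the attribute field choices are placeholders, not the actual ARAX payload.

```python
# Hedged sketch of the TRAPI-style inferred edge discussed above.
# All identifiers and values here are illustrative placeholders.
inferred_edge = {
    "subject": "CHEBI:0000001",   # placeholder chemical CURIE
    "object": "MONDO:0000001",    # placeholder disease CURIE
    "predicate": "biolink:treats_or_applied_or_studied_to_treat",
    "attributes": [
        # knowledge-level tag marking this edge as a prediction, not an assertion
        {"attribute_type_id": "biolink:knowledge_level", "value": "prediction"},
        # the model-confidence attribute named in the thread (value is a placeholder)
        {"original_attribute_name": "probability_treats", "value": 0.87},
        # pointer to the auxiliary graph holding the support-path edges
        {"attribute_type_id": "biolink:support_graphs", "value": ["aux_graph_1"]},
    ],
}
```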
@sierra-moxon we went with the `treats_or_applied_or_studied_to_treat` predicate because our (inferred, not lookup) results are generated by a reinforcement learning approach, so we needed to pick the most general predicate we could, as we can't guarantee a priori that it's not an "applied to treat" or "studied to treat" rather than a true "treats".
The `probability_treats` attribute conveys exactly the confidence the ML model has that the edge can be interpreted as a `treats` edge.
@dkoslicki Just in terms of how this is structured: we're asking a question ?-x->B, and we're looking for answers that are more or less likely to be true. That is, we want answers of the form A-x->B, along with any support for that statement. If you have support for a different predicate y, then that's only interesting if A-y->B also supports A-x->B. Especially if y is a superpredicate of x.
So if you think that the `treats_or_applied` predicate is the best representation, then what would fit best (IMO) would be returning:
A-treats->B (supported by) A-treats_or_whatever->B (supported by) more paths
I don't particularly think that middle layer buys you much, but I may not understand the subtleties of your approach.
Fixed in this PR: https://github.com/RTXteam/RTX/pull/2330
@cbizon it would be great to get some guidance on how the `treats` refactor impacts MVP1. Our understanding may have been incorrect when we replaced all `treats` edges with the more generic mixin in both KG2 and ARAX.
I'm going to schedule a 30 min meetup on this with y'all so that we can address the issues around "treats."
Hi all. I jotted down my thoughts on these issues, as well as some more context around the 'treats' refactor. Looking forward to the call on Monday.
Prior to the refactor, KPs mapped knowledge from many sources to `treats` where the source was actually reporting something more foundational - e.g. that a drug is in a phase 2 trial for a disease, or was self-reported to be taken for a disease by 20 patients. The `treats` predicate was used incorrectly/imprecisely in many cases because its original definition was ambiguous/under-specified, and no other predicates were available. But these relationships do not meet our current criteria for what qualifies as a 'treats' assertion as defined in Biolink.
The treats refactor provided more precise predicates that allow us to express what these sources are actually reporting (e.g. `in clinical trials for`, or `applied to treat`). The slide deck here provides more info about the refactor and how to implement it.
To conform to the 'treats' and KL/AT refactors, KPs like KG2 needed to review their `treats` edges and decide which can continue to use the `treats` predicate as assertions (because they are consistent with the Biolink definition and requirements for asserting this relationship), and which should be 'downgraded' to use one of the new more foundational predicates. We provided the transform/mapping guide here to help with this (note that there are still a few sources that need to be explored and mapped, e.g. MONDO, NCIT, repoDB).
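As a toy illustration of that downgrade decision, a mapping might look like the sketch below. The evidence-type labels and the rules are hypothetical, invented for illustration; they are not the actual transform/mapping guide.

```python
# Hypothetical sketch: decide which refactored predicate a legacy "treats"
# edge should use, based on what its source actually reports.
# The evidence_type labels are invented for illustration.
def remap_legacy_treats(evidence_type: str) -> str:
    mapping = {
        # meets the Biolink bar for an asserted treats relationship
        "regulatory_approval": "biolink:treats",
        # more foundational facts get the new, more precise predicates
        "clinical_trial": "biolink:in_clinical_trials_for",
        "off_label_use": "biolink:applied_to_treat",
        "text_mined": "biolink:treats_or_applied_or_studied_to_treat",
    }
    # default to the weakest (most general) predicate when evidence is unclear
    return mapping.get(evidence_type, "biolink:treats_or_applied_or_studied_to_treat")
```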
It is indeed the case that the majority of `treats` edges were 'lost' in KPs after the treats refactor. The last piece of the puzzle is a way to 'regenerate' the lost `treats` edges, but as predictions that can be made based on the more foundational edges they were replaced by.
Here is where the CQS comes into play: it decides when a `treats` prediction may be warranted based on these more foundational facts, and creates these edges in response to creative mode queries. The CQS creates the `treats` predicted edge, and a support path that consists of the foundational edge it was based on, so that this provenance can be presented in the UI using the existing support path paradigm. There are currently three predominant manifestations of this in our data:
- X `treats` Y (prediction) supported by X `in_clinical_trials_for` Y (assertion)
- X `treats` Y (prediction) supported by X `applied to treat` Y (assertion)
- X `treats` Y (prediction) supported by X `treats_or_studied_or_applied_to_treat` Y

This last one in particular enables text-mined edges that previously mapped to `treats` to now use the weaker predicate `treats_or_studied_or_applied_to_treat`, which more accurately reflects the level of imprecision/uncertainty inherent in text-mined edges (we don’t know if these mined edges report a true `treats` relationship, or merely the fact that a researcher was studying a possible treatment, or a patient/physician tried applying a treatment for their condition).
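Sketched in code, this CQS pattern could look like the following. The function and field names are illustrative, not the actual CQS implementation, and the CURIEs are placeholders.

```python
# Illustrative sketch: wrap a foundational assertion in a predicted `treats`
# edge whose support graph points back at the assertion it was based on.
def make_treats_prediction(foundational_edge: dict, aux_graph_id: str) -> dict:
    return {
        "subject": foundational_edge["subject"],
        "object": foundational_edge["object"],
        "predicate": "biolink:treats",
        "attributes": [
            # tagged as a prediction, per the KL/AT refactor
            {"attribute_type_id": "biolink:knowledge_level", "value": "prediction"},
            # provenance: the support graph contains the foundational edge
            {"attribute_type_id": "biolink:support_graphs", "value": [aux_graph_id]},
        ],
    }

# e.g. the first pattern: X in_clinical_trials_for Y (assertion)
# becomes X treats Y (prediction), supported by the assertion.
asserted = {
    "subject": "CHEBI:0000001",  # placeholder CURIEs
    "object": "MONDO:0000001",
    "predicate": "biolink:in_clinical_trials_for",
}
predicted = make_treats_prediction(asserted, "aux_graph_1")
```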
Note that the CQS is only responsible for generating `treats` edges as predictions to make up for the fact that many direct `treats` edges in KPs like KG2 were lost in the refactor (i.e. replaced with more foundational predicates). ARAs like ARAX continue to generate their predictions as before, but need to tune their templates / train their prediction models on KGs that now include these more foundational predicates.
"we can't a-priori guarantee that it's not an `applied to treat` or `studied to treat` and just a `treats`" (DK).

IMO, while ARAX does use a unique methodology to make its MVP1 predictions, at the end of the day the edges it creates can be understood as 'predictions' that a `treats` relationship may exist between the subject chemical and object condition. The whole idea of creative mode predictions is that we can create `treats` edges and signal our lower certainty by tagging them with KL = 'prediction'.
The reasoner cannot be sure whether the relationship is `treats`, or `applied to treat`, or `studied to treat`, but the fact that one of these is likely to exist is reason enough to make a 'treats' edge as a prediction. This is the same logic followed in making `treats` predictions based on text-mined edges.
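That licensing logic amounts to a simple membership test, sketched below. The predicate set and function name are illustrative, not from any Translator codebase.

```python
# Hypothetical sketch: foundational predicates that license a predicted
# `treats` edge, mirroring the patterns described earlier in the thread.
TREATS_EVIDENCE_PREDICATES = {
    "biolink:in_clinical_trials_for",
    "biolink:applied_to_treat",
    "biolink:treats_or_applied_or_studied_to_treat",
}

def licenses_treats_prediction(predicate: str) -> bool:
    """True if an asserted edge with this predicate can support a predicted
    `treats` edge (knowledge level 'prediction')."""
    return predicate in TREATS_EVIDENCE_PREDICATES
```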
As Chris B said, ARAX could be super explicit about this in its creative `treats` edges, and have two levels of nested support paths:
A-treats->B # arax secondary prediction - based on the xDTD prediction below
(supported by)
A-treats_or_applied_or_studied_to_treat->B # arax xDTD prediction
(supported by)
many support paths # explanatory paths generated post xDTD by the actor-critic network
But I agree this middle layer doesn’t buy us much, and is not necessary. I think it is perfectly fine to have the ARAX xDTD directly predict `treats` and follow the pattern used by all other ARAs:
A-treats->B # arax prediction from xDTD model directly predicts 'treats'
(supported by)
many support paths # explanatory paths generated post xDTD by the actor-critic network
Which is exactly how ARAX predictions looked before the refactor. The difference is that many of the edges in the support paths will be more informative, because they more precisely express what their sources said in the first place.
The `treats` edges in Translator / RTX-KG2 are what they use to train their model. The fact that many of these `treats` edges now use more precise/accurate predicates should not prevent training of the ML tool on the refactored graph, and the additional precision / distinctions they afford may even provide opportunities to more finely tune the model.
The original ticket is resolved in Fugu/Test. @mbrush if there is something in your notes that needs to be addressed, I suggest making a specific ticket.
this issue also applies to BTE
The inferred edge should be `treats`.
looks resolved in prod
This is my understanding from the 7/12 TAQA review of the murals. It looks like primarily an issue with ARAX.
Environment: Test PK: 72895ed9-595c-4294-9e3d-53bd6ee2a8b5