NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Looks Like CQS Results in the UI are returned by KPs as "Reasoning Agents" #805

Closed sstemann closed 4 months ago

sstemann commented 4 months ago

This is one example; for any MVP1 query that the CQS responds to, you will see this behavior:

  1. In UI Test (I saw similar behavior in CI), run MVP1 > Cystic Fibrosis:

https://ui.test.transltr.io/main/results?l=Cystic%20Fibrosis&i=MONDO:0009061&t=0&r=0&q=8b6ff95b-2a6a-4d34-a42f-a6a1a01f8ab8

PK: 8b6ff95b-2a6a-4d34-a42f-a6a1a01f8ab8

Notice: in the UI, the Reasoning Agent facet now has options for Service Provider and MolePro; prior to enabling the CQS, only ARAs were options in the Reasoning Agent facet.

    [screenshot]
  2. Plop that PK into the ARAX GUI and notice that the CQS responds (but isn't listed in the Translator UI as a Reasoning Agent).

    [screenshot]
  3. It's easiest to find a somewhat isolated result (i.e., one from MolePro but not from other Reasoning Agents) - so Idrevloride.

    [screenshot]
  4. This is what it looks like in the ARAX GUI: there are two results, 114 and 639. The result graph for 639 has one support graph.

    [screenshot]
  5. The edge support graph:

    [screenshot]

I don't know what's going on or where the problem is, but this can't be right.

karafecho commented 4 months ago

Thanks for pointing this out, Sarah.

This is a packed issue that needs to get broken down a bit, but for now, I'll note a few things:

  1. I'm tagging @jdr0887 as he is the lead developer of the CQS.
  2. I'm not sure why the UI is displaying Molecular Data Provider and Service Provider TRAPI as "Reasoning Agents" instead of the CQS. We (Matt, Sierra, Max, Jason, me) thoroughly tested the CQS in CI, including checking the structure of the CQS JSON output, and everything seemed fine.
  3. I'm not sure why there are two results for idrevloride, but the predicates in the edges for the support graphs in the 114 result don't look right to me. That's the only duplicate I see, so my guess is that this is not a global issue.
  4. The 639 result and supporting graph look correct to me. (This result is reflective of one of the CQS treats refactor templates.)
karafecho commented 4 months ago

Per Gus and Andy (Slack), if the CQS is acting like an ARA, then it needs to be the primary knowledge source on inferred edges. That is what is causing MolePro and TRAPI Service Provider to show up as Reasoning Agents.
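For concreteness, here is a minimal sketch of the TRAPI sources block that produces this behavior, using MolePro as the example KP (resource IDs and the upstream chain are illustrative, not copied from an actual response). The KP sits in the primary_knowledge_source role, so the UI surfaces it as the Reasoning Agent:

    "sources": [
      {
        "resource_id": "infores:molepro",
        "resource_role": "primary_knowledge_source"
      },
      {
        "resource_id": "infores:cqs",
        "resource_role": "aggregator_knowledge_source",
        "upstream_resource_ids": ["infores:molepro"]
      }
    ]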

Looping in @mbrush, as I'm not sure the suggestion above fits with the "treats refactor" model that he had in mind for the CQS. See this G-sheet.

To Sarah's question about where the CQS results are originating, the CQS is currently supporting 8 "treats refactor" templates (see https://github.com/TranslatorSRI/CQS/tree/main/templates/mvp1-templates), so results from the curated templates for which the CQS was conceptualized (e.g., mvp1-template1-clinical-kps) are being drowned out by the "treats refactor" results. We are considering approaches to rectify this unanticipated effect.

We have a CQS HELP desk call on Friday at 1 pm ET, if you all wish to discuss Sarah's issue then.

gprice1129 commented 4 months ago

I see what you're saying @karafecho. This is confusing to me, @mbrush. I'm not sure why the primary_knowledge_source for an edge generated by the CQS would ever be anything other than the CQS.

karafecho commented 4 months ago

To clarify my last post and hopefully resolve some confusion expressed in Slack: the issue re primary vs. aggregator knowledge source is something that Matt and the UI team need to resolve, but the fact that the CQS is returning responses from, e.g., MolePro, is not an issue but rather an expected result.

The fact that results from the curated templates for which the CQS was conceptualized (e.g., mvp1-template1-clinical-kps) are being drowned out by the "treats refactor" results also is not an issue per se, but rather a consequence of scoring that I'm not thrilled with (i.e., as with other ARAs, the "treats refactor" results are based on one-hop inferences and so are being scored higher than the more complex curated results). We are developing alternative approaches, such as multiple instances of the CQS, to address the scoring impact.

mbrush commented 4 months ago

why the primary_knowledge_source for an edge generated by the CQS would ever be anything other than the CQS

This was widely discussed, and we felt that since it is the KP that does all the work to define when and how to generate these one-hop predictions resulting from the 'treats' refactor - and the KP only uses the CQS to 'write down its answer' - the KP should be credited as the primary source. We did not anticipate the implications this would have in the UI.

I don't think anyone felt strongly about whether the KP or the CQS was the primary source - so the simple fix may be to have the CQS make itself the primary source and the KP a supporting data source, so that KPs are still credited for their work.

Alternatively, we keep the KP as the primary source, but the UI implements a rule that displays the CQS in its 'Reasoning Agent' filter for any prediction coming from the CQS service (regardless of what the TRAPI says is the primary source).
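To make the first option concrete, a minimal sketch of the sources block under that proposal, again with MolePro as a stand-in KP (illustrative only, not an actual response): the CQS takes the primary_knowledge_source role, and the KP is retained as a supporting_data_source so it is still credited.

    "sources": [
      {
        "resource_id": "infores:cqs",
        "resource_role": "primary_knowledge_source"
      },
      {
        "resource_id": "infores:molepro",
        "resource_role": "supporting_data_source"
      }
    ]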

karafecho commented 4 months ago

Following up from the CQS HELP desk discussion with Matt, Sierra, Gus, and Jason, here are the action items:

  1. Jason will update the "treats refactor" templates such that the CQS is the primary knowledge source and the KPs that provide the inference rules are the supporting data sources, contingent on Matt's confirmation that the contributing KPs agree to this plan
  2. Jason will update CQS results to indicate KL as "prediction" and AT as "computational model"
  3. [Unrelated to this ticket, but worth noting, just to avoid any further confusion] Jason will update the CQS code and templates such that "biolink:publications" and "biolink:max_research_phase" are copied over for the "treats refactor" edges within an attribute block, for propagation from foundational edges contributed by supporting data source(s) (a sketch of the resulting edge attributes follows this list):

    "attribute_type_ids": [
      "biolink:publications",
      "biolink:max_research_phase"
    ]
  4. Matt/Sierra will coordinate with relevant KPs regarding non-CQS KP-specific issues re the "treats" refactor (e.g., redundant ChEMBLs)
    • Part of this effort will involve migrating Clinical Trials KP to Automat and deprecating all of the ChEMBL edges
  5. Kara will bring a proposal to Architecture for a modified CQS scoring approach (e.g., apply weights to scores for various templates, score the top X% of results from curated templates higher, create multiple instances of the CQS to support different use cases) in order to surface results from the curated templates that are currently being drowned out by the "treats refactor" results. The goal is to surface the curated template results in the UI so that we have an opportunity to solicit broader community feedback.
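To illustrate action items (2) and (3), here is a minimal sketch of the attribute block a "treats refactor" edge might carry once these updates land, with KL/AT set and publications/max_research_phase propagated from the foundational edges. The specific values are placeholders, not taken from an actual CQS response (the // notes mark the assumptions):

    "attributes": [
      { "attribute_type_id": "biolink:knowledge_level", "value": "prediction" },
      { "attribute_type_id": "biolink:agent_type", "value": "computational_model" },
      { "attribute_type_id": "biolink:publications", "value": ["PMID:12345678"] },  // placeholder PMID
      { "attribute_type_id": "biolink:max_research_phase", "value": "clinical_trial_phase_4" }  // illustrative value
    ]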
karafecho commented 4 months ago

Update:

Action items (1), (2), and (3) from above post were completed on Friday, 6/28, with fixes tested in DEV on Monday, 7/1. An ITRB request to push to CI has been submitted.

Additional update:

Confirmed on Tuesday, 7/2 that ITRB pushed the latest version of the CQS to CI.

jdr0887 commented 4 months ago

Now that items 1, 2, and 3 from the bullets above have been rectified in the CQS, is there any reason the ui.ci.transltr.io app would not now list/include the CQS?

karafecho commented 4 months ago

@sstemann and others: The CI environment appears to be very unstable right now. I'm receiving errors without PKs, incomplete results (meaning the UI never finishes running a query), or weird results that seem to reflect the older instance of the CQS (the version before the one that has been confirmed as deployed in CI). According to Mark, Shervin and ITRB are working on some Jaeger deployment issues in CI, which appear to be the source of the UI CI issues. As such, I don't think any additional testing on our end is warranted until @ShervinAbd92 confirms that the CI environment is stable. Thanks, all!

ShervinAbd92 commented 4 months ago

Hi @karafecho, the ARS is stable now and deployed on Jaeger. We still have to figure out how to show async tool traces under their initial trace rather than separated on their own. I will let you know in the outages channel when we are going to try for that. You can resume testing on the ARS.

karafecho commented 4 months ago

Great! Thanks for the update, Shervin. I did not know there was an outages channel. I just joined.

karafecho commented 4 months ago

Update:

With the CI stability issues and an additional ARS issue resolved, I went ahead and reran the cystic fibrosis query at ui.ci.transltr.io. Here is my summary of findings:

I am not exactly sure what's going on, but I think it would be helpful to have the UI team take a look. @gprice1129 @dnsmith124, @Genomewide, @NishadPrakash, perhaps you all can chime in?

sstemann commented 4 months ago

When I look at the EPC of the same result from the CQS and from BTE, the edge sources seem to be mapped differently. I believe ARAs populate aggregator_knowledge_source with their own ARA infores; if the CQS needs this to be mapped differently, that should be a discussion about changing this standard. While I compared CQS vs. BTE below, you can see that ARAX also follows this standard if you compare tobramycin between the CQS and ARAX. Mannitol (one of the many edge support graphs) from ARAGORN goes through MolePro, and you can see the difference in EPC modeling compared with the CQS.

If aggregator_knowledge_source is not required on the inferred path, I think this affects all other ARAs.

Inferred Path               | CQS                    | BTE
----------------------------|------------------------|---------------------------------
Predicate                   | Treats                 | In Clinical Trials For
primary_knowledge_source    | infores:cqs            | infores:chembl
supporting_data_source      | service-provider-trapi |
aggregator_knowledge_source |                        | mychem-info, biothings-explorer

Edge Support Path           | CQS                                            | BTE
----------------------------|------------------------------------------------|-----
Predicate                   | In Clinical Trials For                         | None
primary_knowledge_source    | chembl                                         |
supporting_data_source      |                                                |
aggregator_knowledge_source | arax, mychem-info, service-provider-trapi, cqs |
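Rendered as TRAPI JSON, the CQS inferred-path column above would correspond to roughly the following sources block (reconstructed from the table, not copied from a response):

    "sources": [
      { "resource_id": "infores:cqs", "resource_role": "primary_knowledge_source" },
      { "resource_id": "infores:service-provider-trapi", "resource_role": "supporting_data_source" }
    ]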

CQS: [screenshot]

[screenshot]

BTE: [screenshot]

mbrush commented 4 months ago

Hi. I've said this before and will say it again now: I think we should pause on testing/reporting any issues for MVP1 related to results based on ChEMBL and its ingest of clinical trials data. It has become clear from talking with KPs and ARAs that teams are at different stages of implementing the treats refactor (e.g., KG@ has yet to implement it), and some inconsistencies remain in how it is being done (e.g., SPOKE is not reporting clinical trials IDs appropriately). On top of this, we know that these refactored ChEMBL ingests will all be replaced by the CTKP in the next month or so anyway. Trying to QA during this transition period is, IMO, not a good use of our time, given that the CTKP swap-out will result in things looking very different (in a good way: cleaner, less duplication, less inconsistency), and many issues that testers have painstakingly documented will be fixed as a result of this swap-out.

I propose that QA be directed to MVP2 over the next month, and that we make a concerted push to get the CTKP swap-out complete ASAP. Once this has been done and made its way into Test, we can shift testing focus back to MVP1.

I am happy to help coordinate the moving parts here: KPs needing to swap out their ChEMBL ingests, Gwenlyn/Multiomics in their CTKP implementation and move over to Automat, and the CQS team in ensuring that 'treats' predictions continue to be generated correctly, with proper EPC reported after the swap.

Maybe we can discuss this on the next TACT call and agree on the most efficient plan? I copied my comments above into an issue in the TACT repo to ensure we follow up on it there.

karafecho commented 4 months ago

+1 to Matt's comments and proposals. I had suggested a small-group meeting in Slack, but I think Matt's suggestion of bringing the issue(s) to TACT makes more sense.

gprice1129 commented 4 months ago

@jdr0887: In our last call, I told you that the primary_knowledge_source on the edges is the only thing that needs to change for the CQS to show up in the facets, but that is incorrect (so sorry about this!). The resource_id attached to the analysis also needs to change to infores:cqs.
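In TRAPI terms, that means each result analysis the CQS returns should carry the CQS infores, roughly as sketched below (a minimal sketch; the edge-binding contents are elided, and the shape follows the TRAPI analysis object):

    "analyses": [
      {
        "resource_id": "infores:cqs",
        "edge_bindings": { }
      }
    ]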

sierra-moxon commented 4 months ago

@mbrush - Can we work out the CQS provenance issue so that we can disambiguate this issue from the issue with redundant Chembl ingests?

karafecho commented 4 months ago

Jason addressed all CQS provenance-related issues that were raised in this thread. The other issues relate to ChEMBL and the treats refactor. As such, I'm closing this issue.