NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

missing (expected) compounds in results #365

Open Rosinaweber opened 1 year ago

Rosinaweber commented 1 year ago

Aggressive systemic mastocytosis (ASM) is a rare disease. I have a family member who was diagnosed with it on 06/09/23. ChEMBL lists two FDA-approved drugs prescribed for it: MIDOSTAURIN (approved in 2017) and AVAPRITINIB, or AYVAKIT (approved in 2022). A third drug, bezuclastinib (CGT9486), is currently in clinical trials; ChEMBL has it as a compound.

ARAX: out of 10 results, it includes MIDOSTAURIN, two drugs for cancer, and steroids.
IMPROVING: out of 300 drugs, MIDOSTAURIN is the only one with score 1.
BTW: out of 500 results, MIDOSTAURIN is ranked 370.
UNSECRET and ARAGORN do not include MIDOSTAURIN or any of the others.
No ARA has AVAPRITINIB (or AYVAKIT), but MOLEPRO produces it as a result.
No agent in Translator has bezuclastinib (CGT9486).

This may be a useful use case. Please let me know if you have any questions.

sierra-moxon commented 1 year ago

@sandrine-m - is it possible that MolePro has more information on the treatments @Rosinaweber suggests? I'm not sure who to assign this to, or whether anything here is fixable by September. Any ideas?

suihuang-ISB commented 1 year ago

@Rosinaweber: A manual check in SPOKE, which imPROVING uses, shows that Midostaurin is FDA approved (Phase 4), whereas Masitinib, also in SPOKE, is only in Phase 3 (not yet approved, and approval was denied by the EMA in Europe). Perhaps that is why we have only the older drug Midostaurin: imPROVING uses the Phase for ranking. By contrast, AVAPRITINIB was only recently approved and is missing in SPOKE because SPOKE gets its information from ChEMBL, which points to a single clinical trial that is only in Phase 2.
Moreover, for some reason AVAPRITINIB has been entered in SPOKE only under its chemical name, so it could have been missed when searching the returned results.
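
As a rough illustration of why phase-based ranking would favor Midostaurin, here is a minimal sketch that sorts candidates by their maximum clinical trial phase. It is not imPROVING's actual code: the function name `rank_by_phase` and the record layout are assumptions, and the phase values are simply the ones quoted in this comment.

```python
# Hypothetical records; the phases are the ones reported in this comment.
candidates = [
    {"name": "Masitinib", "max_phase": 3},
    # Phase 2 per the single ChEMBL trial record mentioned above.
    {"name": "Avapritinib", "max_phase": 2},
    {"name": "Midostaurin", "max_phase": 4},
]


def rank_by_phase(drugs, min_phase=0):
    """Return drugs with max_phase >= min_phase, highest phase first."""
    kept = [d for d in drugs if d["max_phase"] >= min_phase]
    return sorted(kept, key=lambda d: d["max_phase"], reverse=True)


ranked = rank_by_phase(candidates)
print([d["name"] for d in ranked])
# With min_phase=4, only approved drugs survive the filter.
print([d["name"] for d in rank_by_phase(candidates, min_phase=4)])
```

Under this kind of scheme, a drug whose source record still shows an outdated phase would be ranked low or filtered out regardless of its current approval status.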

sandrine-m commented 1 year ago

@sierra-moxon regarding data scouting of disease--compound associations: these are only partially in the MolePro scope. We do report some diseases as part of the clinical trials data, but we do not perform targeted data scouting on disease--compound associations (in order not to overlap with other KPs).

@Rosinaweber could you please share the full PK link of your search?

I can see a few issues here:

Bezuclastinib name resolution:

It looks to me as though MolePro is aware of, and exposes to Translator, all the info it knows about bezuclastinib. The ARAX UI synonyms only report UMLS and NCIT (and the nodes do not map to the same Biolink class within the ChemicalEntity hierarchy). They all seem to come from KG2.

The name resolver has the same output as the ARAX UI (CI):

{
  "UMLS:C5554531": [
    "Bezuclastinib"
  ],
  "UNII:2ROQ545LAG": [
    "BEZUCLASTINIB"
  ],
  "PUBCHEM.COMPOUND:75593308": [
    "bezuclastinib",
    "Bezuclastinib"
  ]
}

The lack of complete mapping resolution can lead to difficulties in finding it at the result level. @edeutsch @gaurav would you agree with this analysis?
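
To make the mapping gap concrete, here is a small sketch that lists which identifier prefixes appear in the name-resolver output above and which are absent. The `EXPECTED_PREFIXES` set is an assumption chosen for illustration (it includes ChEMBL because the original report notes that ChEMBL has the compound, and NCIT because the ARAX UI reports it); it is not an official Translator requirement.

```python
import json

# Name-resolver output copied from the comment above.
resolver_output = json.loads("""
{
  "UMLS:C5554531": ["Bezuclastinib"],
  "UNII:2ROQ545LAG": ["BEZUCLASTINIB"],
  "PUBCHEM.COMPOUND:75593308": ["bezuclastinib", "Bezuclastinib"]
}
""")

# Assumption: prefixes one might hope to see resolved for a small molecule.
EXPECTED_PREFIXES = {"PUBCHEM.COMPOUND", "CHEMBL.COMPOUND", "UNII", "UMLS", "NCIT"}


def prefixes(curies):
    """Return the set of CURIE prefixes (the part before the first colon)."""
    return {curie.split(":", 1)[0] for curie in curies}


found = prefixes(resolver_output)
missing = sorted(EXPECTED_PREFIXES - found)
print("found:", sorted(found))
print("missing:", missing)
```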

Environment for testing / ranking

I retested on UI test (PK: c4f80468-f3d8-45bc-8853-7fdf93a16aa2):

Midostaurin: no. 2
Avapritinib: appears in a path no. 36
bezuclastinib: -

I ran a retest on ARAX UI (both test and CI), Midostaurin and Avapritinib rank pretty high (note that the ranking function implementation/deployment deadline is 07/14):

image

Rosinaweber commented 1 year ago

Thank you, @suihuang-ISB. This is very clarifying. Thank you, @sandrine-m. Here's the main PK: 6c7fe853-0cb3-43d5-ae00-74013955d135

sandrine-m commented 1 year ago

I am unsure what the requests are here. Is the request about (1) how to improve the quality of the answer, (2) why MolePro has info on bezuclastinib that the ARAs do not have, or (3) providing a use case to test ranking performance?

I made a retest today. Here is the CI.

Rosinaweber commented 1 year ago

Thanks, @sandrine-m
I believe these are questions for the ARAs, not for MolePro. MolePro includes this drug, but the ARAs do not.

Rosinaweber commented 1 year ago

> @Rosinaweber: A manual check in SPOKE, which imPROVING uses, shows that Midostaurin is FDA approved (Phase 4), whereas Masitinib, also in SPOKE, is only in Phase 3 (not yet approved, and approval was denied by the EMA in Europe). Perhaps that is why we have only the older drug Midostaurin: imPROVING uses the Phase for ranking. By contrast, AVAPRITINIB was only recently approved and is missing in SPOKE because SPOKE gets its information from ChEMBL, which points to a single clinical trial that is only in Phase 2. Moreover, for some reason AVAPRITINIB has been entered in SPOKE only under its chemical name, so it could have been missed when searching the returned results.

Hi @suihuang-ISB, even if not approved in Europe, it is currently undergoing clinical trials and is being used by many patients, which, for a rare disease, may mean a good proportion of patients. This is what our novelty score should identify as novel, but it cannot do so if the drug does not appear in the results.

sandrine-m commented 1 year ago

Thank you @Rosinaweber, that is clearer. I unassigned the MolePro team and changed the title. So my understanding is that you, as a user, were expecting to get bezuclastinib back as a response somewhere, but it does not show up. I dug into this a little further and found that the clinical trial for this compound is still ongoing and is not reported in ChEMBL. The question is therefore: does Translator have knowledge of the clinical trial? I ran the following query to ask what is known about bezuclastinib (UMLS:C5554531) and Aggressive systemic mastocytosis (MONDO:0020333):

{
   "edges": {
      "e00": {
         "subject":   "n00",
         "object":    "n01",
         "predicates": ["biolink:related_to"]
      }
   },
   "nodes": {
      "n00": {
         "ids":        ["UMLS:C5554531"]
      },
      "n01": {
         "ids":  ["MONDO:0020333"]
      }
   }
}

Just to be 100% sure the normalization did not miss anything, I reran the query with the PubChem ID, and it led to the same result.
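
For anyone who wants to repeat this check, here is a minimal sketch that wraps the query graph above in a TRAPI request body (`{"message": {"query_graph": ...}}`) and swaps in either identifier. The helper name `build_message` is hypothetical, and the endpoint used for the original test is not stated in this thread, so the sketch only builds and prints the payloads rather than sending them.

```python
import copy
import json

# Query graph as posted above.
QUERY_GRAPH = {
    "edges": {
        "e00": {
            "subject": "n00",
            "object": "n01",
            "predicates": ["biolink:related_to"],
        }
    },
    "nodes": {
        "n00": {"ids": ["UMLS:C5554531"]},
        "n01": {"ids": ["MONDO:0020333"]},
    },
}


def build_message(compound_curie):
    """Wrap the query graph in a TRAPI request body for the given compound."""
    graph = copy.deepcopy(QUERY_GRAPH)  # leave the template untouched
    graph["nodes"]["n00"]["ids"] = [compound_curie]
    return {"message": {"query_graph": graph}}


# One payload per identifier: the UMLS ID first, then the PubChem ID.
for curie in ("UMLS:C5554531", "PUBCHEM.COMPOUND:75593308"):
    print(json.dumps(build_message(curie), indent=2))
```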

No KP reports the clinical trial so I guess it is expected that ARAs do not have it in their results?

Assigning this issue to Gwênlyn as she might be able to help further.

Rosinaweber commented 1 year ago

@gglusman Hi Gwênlyn, when you have a chance, would you please let us know your thoughts on this issue? The fundamental question is whether ARAs are effectively searching for and finding novel drugs. Our concern is that the majority of the ARAs may be missing something. If this is a common occurrence, then we need to do something about it. If it is an outlier, as has been suggested, then we are fine and can close this issue. Thanks.

gglusman commented 1 year ago

Sorry, I hadn't seen the 'assigned' email from three weeks ago. The current version of our Clinical Trials KG indeed doesn't include NCT04996875, or in fact any trials involving bezuclastinib. Looking into it with @GitHubbit .

sstemann commented 3 months ago

In the UI, we do return MIDOSTAURIN (score 5.0, highest rank) and AVAPRITINIB (score 4.53):

https://ui.transltr.io/main/results?l=Aggressive%20Systemic%20Mastocytosis&i=MONDO:0020333&t=0&r=0&q=4fba3898-e8bb-46aa-89ae-a5cc26b1fa82

ARAs do not return Bezuclastinib in MVP1.

In MVP1 MIDOSTAURIN support paths include Midostaurin Affects Kit

image

It looks like other results in MVP1 are returned with support paths "Drug Affects KIT".

https://ui.transltr.io/main/results?l=Bezuclastinib&i=PUBCHEM.COMPOUND:75593308&t=4&r=0&q=e3560428-89ea-4ad3-96ea-bb099e6cd34a

image

If you run MVP2 ("what genes may be downregulated by Bezuclastinib"), the second result is KIT. Which is to say, I think the relationships suggesting that Bezuclastinib may treat ASM are in the Translator network.
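
The reasoning here (a drug that affects the same gene as a known treatment is a candidate) can be sketched on a toy edge list. The edges below are only the statements made in this thread, with simplified predicates; the function name `candidates_via_shared_target` is hypothetical, and this is not how any ARA actually computes its results.

```python
# Toy edges taken from statements in this thread; not a Translator export.
EDGES = [
    ("Midostaurin", "affects", "KIT"),
    ("Bezuclastinib", "affects", "KIT"),
    ("Midostaurin", "treats", "ASM"),
    ("Avapritinib", "treats", "ASM"),
]


def candidates_via_shared_target(edges, disease):
    """Drugs that affect a gene also affected by a known treatment of `disease`."""
    known = {s for s, p, o in edges if p == "treats" and o == disease}
    targets = {o for s, p, o in edges if p == "affects" and s in known}
    return sorted(
        s for s, p, o in edges
        if p == "affects" and o in targets and s not in known
    )


print(candidates_via_shared_target(EDGES, "ASM"))
```

On these toy edges the only candidate is Bezuclastinib, reached through the shared KIT target.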

I'm tagging BTE, because it's their top answer in that MVP2.