NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Edges with "in clinical trials for" don't seem to count towards the "Clinical Trials" type in the "Evidence" column for Translator results #846

Open saramsey opened 3 weeks ago

saramsey commented 3 weeks ago

In the ui.test.transltr.io system, just now, I ran a test query "What drugs may treat conditions related to Crohn's Colitis?".

Here is a link to the results: https://ui.test.transltr.io/main/results?l=Crohn's%20Colitis&i=MONDO:0005532&t=0&r=0&q=3642d7b7-2879-4e61-99b2-6bbc78949ea4

I note that the first-ranked result, Etiprednol Dicloacetate, has an explanatory graph based on an "in clinical trials for" edge, as shown here: Screenshot 2024-07-05 at 1 14 36 PM

But the "Evidence" count table for this result doesn't reflect that in the count for "Clinical Trials", which seems counter-intuitive as a user: Screenshot 2024-07-05 at 1 15 46 PM

Would it not make sense to increment the "Clinical Trials" Evidence type count when there are results based on an "in clinical trials for" edge, regardless of the source?
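
As a rough illustration of the behavior proposed above, here is a minimal Python sketch. It is not the Translator UI's code: the edge dictionaries, the `evidence` and `predicate` keys, and the function name are all hypothetical, chosen only to show the idea of crediting one clinical trial to an edge whose predicate is "in clinical trials for" when that edge lists no trial of its own.

```python
from collections import Counter

# Hypothetical predicate label; the identifier used in real results may differ.
CLINICAL_TRIAL_PREDICATE = "in clinical trials for"


def count_evidence_with_predicate_fallback(edges):
    """Count evidence items per type across the edges of one result.

    An edge whose predicate is "in clinical trials for" but which carries no
    clinical-trial evidence item still adds one to "Clinical Trials".
    """
    counts = Counter()
    for edge in edges:
        evidence = edge.get("evidence", [])
        for item in evidence:
            counts[item["type"]] += 1
        has_trial = any(item["type"] == "Clinical Trials" for item in evidence)
        if edge.get("predicate") == CLINICAL_TRIAL_PREDICATE and not has_trial:
            counts["Clinical Trials"] += 1
    return counts


# Made-up edges for illustration only.
edges = [
    {"predicate": "in clinical trials for", "evidence": []},
    {"predicate": "related to", "evidence": [{"type": "Publications"}]},
]
result = count_evidence_with_predicate_fallback(edges)
```

With this fallback the count would say that a trial exists without pointing to a specific one, so a user still could not click through to the trial itself.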

sstemann commented 2 weeks ago

@brettasmi is it possible for IA/SPOKE to provide the CT in the EPC when using the In Clinical Trials For edge? https://clinicaltrials.gov/search?term=NCT00035503?

@dnsmith124 can you confirm how the Evidence > Clinical Trials (count) is calculated?

dnsmith124 commented 2 weeks ago

The evidence counts are calculated from the EPC available on all edges within a single result. In this case, none of the result's edges, including the "in clinical trials for" edge, had any EPC beyond 2 primary knowledge sources.

I agree with @sstemann, I think what makes the most sense here would be to provide the CT(s) in question when using that edge.
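
To make the calculation described above concrete, here is a minimal Python sketch, assuming a simplified edge shape. The `epc` and `primary_sources` keys, the placeholder source names, and the function name are hypothetical and do not reflect the actual UI code or the TRAPI schema. It only illustrates why an edge with no evidence items contributes nothing to the "Clinical Trials" count, whatever its predicate says.

```python
from collections import Counter


def count_evidence_from_epc(edges):
    """Tally evidence items by type using only the EPC attached to each edge.

    The predicate is never consulted, so an edge without evidence items
    adds nothing to any count.
    """
    counts = Counter()
    for edge in edges:
        for item in edge.get("epc", []):
            counts[item["type"]] += 1
    return counts


# Made-up result: the edge names two primary knowledge sources (placeholders)
# but carries no evidence items such as a clinical trial or a publication.
result_edges = [
    {
        "predicate": "in clinical trials for",
        "primary_sources": ["source-a", "source-b"],
        "epc": [],
    },
]
counts = count_evidence_from_epc(result_edges)
```

Here `counts["Clinical Trials"]` is 0, which matches what the Evidence column showed for this result.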

brettasmi commented 2 weeks ago

We are unable to provide the specific clinical trial at this time.

gglusman commented 2 weeks ago

...well, you could ;)

sstemann commented 2 weeks ago

my gut reaction is that if the edge reads "in clinical trials for", users would expect to see a CT in the evidence. I think that's @saramsey's reaction as well. @Genomewide could we run this by users? @sierra-moxon @mbrush was EPC for this edge discussed?

Genomewide commented 2 weeks ago

@sstemann I don't think we need to run this by any more users. I have heard this before when we have the "indicated for" facet meaning it is in a clinical trial for the selected disease. My hope is that Gwenlyn's service will replace this information or we can get more specific info on this type of edge.

My guess is that it has always been this way. It probably just used to say 'treats', and the refactor has drawn attention to this.

Here are the steps I took to get the specific information: go to the wiki page, click the link to ChEMBL, come back to our site to copy the name of the drug, search for the drug in ChEMBL, go down to the Indication section, find the trial, and click the reference to the clinical trial to get to the clinical trial page. This is just too much for a user if they have to look at multiple edges.


cartmanbeck commented 2 weeks ago

@brettasmi Can you elaborate for me on why we aren't able to find that clinical trial number at this time? It does look like the information is available through ChEMBL (https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL2107614/). So is this something that just wasn't ingested when ChEMBL was ingested? If that's the case, my first inclination is "you should go get that".

Alternatively, if we can add the needed information using Gwenlyn's new CT data source using the Annotator service, that's great, but I would still like to see the underlying KPs pulling this information if they're going to create an edge called "in_clinical_trials_for".

sierra-moxon commented 2 weeks ago

My understanding from the Architecture meeting on 7/2/24 was that KGs ingesting clinical trial edges via sources other than @gglusman's KG were planning (before the end of this Translator phase) to stop exposing those edges to Translator. Instead, they are/were planning to return results from @gglusman's KG (via ingestion into their KGs, API call, or just not serving this data and letting it come directly from Gwen's KG instead).

If I understand that discussion correctly, the only action for IA/SPOKE is to remove the ChEMBL ingest of clinical trial edges from their KG (or otherwise stop serving it to Translator queries)?

cartmanbeck commented 2 weeks ago

So in that case, then why would we need this to be included in the Annotator? If the edges themselves are going to be served by ARAs through a connected KP, then the edges will hold the needed information, and we won't need to do anything with Annotator. Correct?

sierra-moxon commented 2 weeks ago

That is my understanding, yes. @gglusman should comment here. I know she was trying to solicit feedback from KPs currently ingesting clinical trial data to ensure that her new KG adequately replaced all node and edge properties in use.

gglusman commented 2 weeks ago

@brettasmi and I are in direct talks right now on ingesting CTKP into imProving/SPOKE. :)

gglusman commented 2 weeks ago

> So in that case, then why would we need this to be included in the Annotator? If the edges themselves are going to be served by ARAs through a connected KP, then the edges will hold the needed information, and we won't need to do anything with Annotator. Correct?

We identified two distinct use cases for providing information on CTs:

  1. Providing information connecting treatments to conditions, as derived from CTs - this is the domain of CTKP. Importantly, only a subset of all existing CTs will contribute to these edges.
  2. Providing basic information on CTs, when they come from any source (e.g., an NCTID is provided for any reason). This would require having information on all existing CTs. This is the domain of Annotator.
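
For the second use case, one small piece can be sketched in Python: spotting NCT IDs wherever they appear, so that something like Annotator could look them up. ClinicalTrials.gov identifiers have the form `NCT` followed by eight digits. The function name is hypothetical, and this says nothing about how Annotator or CTKP are actually implemented.

```python
import re

# ClinicalTrials.gov registry identifiers: "NCT" followed by eight digits.
NCT_ID_PATTERN = re.compile(r"\bNCT\d{8}\b")


def find_nct_ids(text):
    """Return the distinct NCT IDs found in a string, sorted."""
    return sorted(set(NCT_ID_PATTERN.findall(text)))


# The search link quoted earlier in this thread.
ids = find_nct_ids("https://clinicaltrials.gov/search?term=NCT00035503")
```

For that link, `ids` is `["NCT00035503"]`.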

brettasmi commented 2 weeks ago

> My guess is that it has always been this way. It probably just used to say 'treats', and the refactor has drawn attention to this.

@Genomewide, correct.

> Can you elaborate for me on why we aren't able to find that clinical trial number at this time?

It's not that we can't find it, it's that SPOKE doesn't ingest it. Ingesting clinical trials via ChEMBL is imperfect, as their own ingestion isn't without error.

I don't want to speak on behalf of SPOKE's maintainers, but I can say that ingesting clinical trials data into SPOKE has been discussed many times internally. However, there are competing priorities with other data that may be more useful to SPOKE's users, who are not all as focused on drug development/repurposing as Translator's current audience.

@gglusman's work presents a compelling and convenient way to get this data, so we are talking about how we might make that happen.

> My understanding from the Architecture meeting on 7/2/24 was that KGs ingesting clinical trial edges via sources other than @gglusman's KG were planning (before the end of this Translator phase)

@sierra-moxon, by "end of this Translator phase," do you mean end of the Eel sprint or end of year?

sierra-moxon commented 2 weeks ago

@brettasmi - per the TACT call this morning, @mbrush will coordinate timelines for this. tagging him here.