NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Daily Med Links in EPC are Duplicative #746

Closed: sstemann closed this issue 6 months ago

sstemann commented 7 months ago

Run MVP 1 for Common Cold in Test: https://ui.test.transltr.io/main/results?l=Common%20Cold&i=MONDO:0005709&t=0&r=0&q=52650956-f0dd-43f8-a7df-7b60d6f0cb29

  1. Expand the paths for Aspirin.

  2. In the Lookup Path, select Treats > Miscellaneous.

There are five pairs of DailyMed links.

image

Also note: clinical trials are linked on both the Clinical Trials tab and the Miscellaneous tab.

image

ARAX and ARAGORN return the same DailyMed links twice in the same result, and sometimes both ARAs return the same EPC twice:

  1. Primary KS: Chembl, Aggregator: mychem-info, Aggregator: service-provider-trapi

    image

  2. Primary KS: Chembl, Aggregator: molepro, Aggregator: arax/aragorn

    image

edeutsch commented 7 months ago

Am I tagged because some behavior change is expected from ARAX? This seems like a merging/UI issue, right? The same information is arriving via different paths and it contains the same dailymed attributes. It is the same CHEMBL record that has the information, but it arrives via molepro or BTE or ARAX or Aragorn. Which I think is all good, right? It just needs some refinement at the merging or UI level?

sstemann commented 7 months ago

@edeutsch I guess I was looking for confirmation that this is expected behavior: returning two triples with identical primary knowledge sources for the same result. I'm not sure I understand what value is added to the response when a different aggregator aggregates the same primary knowledge source via the same triple, or whether this affects scoring if the same evidence is taken into account multiple times.

edeutsch commented 7 months ago

I am uncertain if we have developed a policy on this. I think ARAX does not do the coalescing, under the assumption that the coalescing happens at the ARS/merging level.

Maybe we should develop some policies around this, using this nice example. Or remind ourselves (or me) if there are already policies.
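
To make the coalescing question concrete, here is a minimal sketch of merging edges that share the same triple and the same primary knowledge source. The dictionary layout, the field names, and the `coalesce_edges` helper are assumptions made up for illustration; this is not the TRAPI schema and not the actual merging code of any ARA or the ARS.

```python
def coalesce_edges(edges):
    """Merge edges that share (subject, predicate, object, primary KS).

    Hypothetical edge layout: a dict with "subject", "predicate", "object",
    "primary_ks", and a list of "aggregators".
    """
    merged = {}
    for edge in edges:
        key = (edge["subject"], edge["predicate"], edge["object"], edge["primary_ks"])
        if key not in merged:
            merged[key] = {
                "subject": edge["subject"],
                "predicate": edge["predicate"],
                "object": edge["object"],
                "primary_ks": edge["primary_ks"],
                "aggregators": [],
            }
        # Keep every aggregator as provenance, but only once each.
        for aggregator in edge["aggregators"]:
            if aggregator not in merged[key]["aggregators"]:
                merged[key]["aggregators"].append(aggregator)
    return list(merged.values())


# Illustrative input modeled on the two cases listed in this issue.
edges = [
    {"subject": "Aspirin", "predicate": "treats", "object": "MONDO:0005709",
     "primary_ks": "Chembl", "aggregators": ["mychem-info", "service-provider-trapi"]},
    {"subject": "Aspirin", "predicate": "treats", "object": "MONDO:0005709",
     "primary_ks": "Chembl", "aggregators": ["molepro", "arax"]},
]
coalesced = coalesce_edges(edges)
```

Under this kind of policy the evidence would appear once, with the aggregator chain retained as provenance. Whether that is the behavior we want, and at which level it should happen, is the open question.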

gprice1129 commented 7 months ago

@sstemann it appears that the links in the screenshot are not exact duplicates: one uses the http protocol and the other uses https. However, given that they are semantically the same, we have decided to convert all http links to https, which should remove the DailyMed duplicates.
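
A minimal sketch of that normalization, assuming the links arrive as plain strings. The helper names `to_https` and `dedupe_links` and the sample `setid` value are hypothetical, and this is not the UI's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Rewrite an http URL to https; leave any other scheme untouched."""
    parts = urlsplit(url)
    if parts.scheme.lower() == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


def dedupe_links(urls):
    """Normalize each link to https and keep the first occurrence of each."""
    seen = set()
    unique = []
    for url in urls:
        normalized = to_https(url)
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


# Placeholder links that differ only in protocol.
links = [
    "http://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid=example-setid",
    "https://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid=example-setid",
]
deduped = dedupe_links(links)
```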

The clinical trials show up in both tabs because different sources give us both the clinical trial ID and the link to the clinical trial. We will resolve this on our end by converting the clinical trial link to an ID.
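
Roughly, the link-to-ID conversion could look like the sketch below. It assumes trial IDs follow the ClinicalTrials.gov pattern of `NCT` plus eight digits and that the ID appears somewhere in the link; the helper names and the sample ID are hypothetical, and this is not the UI's actual code.

```python
import re

# Assumption: ClinicalTrials.gov identifiers look like "NCT" + 8 digits.
NCT_PATTERN = re.compile(r"NCT\d{8}", re.IGNORECASE)


def to_trial_id(value):
    """Return the trial ID found in a bare ID or a link, or None if absent."""
    match = NCT_PATTERN.search(value)
    return match.group(0).upper() if match else None


def unique_trial_ids(values):
    """Map IDs and links to IDs, dropping repeats and unparseable values."""
    ids = []
    for value in values:
        trial_id = to_trial_id(value)
        if trial_id is not None and trial_id not in ids:
            ids.append(trial_id)
    return ids


# Placeholder values: one source sends the ID, another sends the link.
trial_ids = unique_trial_ids([
    "NCT00000001",
    "https://clinicaltrials.gov/study/NCT00000001",
])
```

Once both forms collapse to the same ID, the trial only needs to be shown on the Clinical Trials tab.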

I'll message again once the change is live and can be tested.

gprice1129 commented 6 months ago

I couldn't prove this was fixed in CI because we're getting different paths back that do not have this example. I was able to prove this was fixed with the specific link. Closing.