opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Crashing indication widget - Add extra sources of clinical information to UI #3348

Closed d0choa closed 1 week ago

d0choa commented 1 week ago

Describe the bug: Dictionaries in ot-ui-apps need to be updated to accommodate the new clinical data sources available in 23.06.

Observed behaviour: The indication widget in DEV currently crashes for many drugs:
https://platform.dev.opentargets.xyz/drug/CHEMBL1201580
https://platform.dev.opentargets.xyz/drug/CHEMBL4298209

but not for others: https://platform.dev.opentargets.xyz/drug/CHEMBL1162175

I think it is crashing because the records in the widget contain additional values in drug.indications.rows[].references[].source.

The following chunk of the indications response for the drug CHEMBL4298209 contains sources (e.g. EMA) that I don't think the FE expects:

  "references": [
      {
          "ids": [
              "label/2020/761158s000lbl.pdf"
          ],
          "source": "FDA",
      },
      {
          "ids": [
              "EMEA/H/C/004935"
          ],
          "source": "EMA",
      },
      {
          "ids": [
              "16a160a4-3ec0-4ddf-99ce-05912dd3382d"
          ],
          "source": "DailyMed",
      },
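
The suspected failure mode can be sketched in a few lines: collect every `source` value found under `references` and compare it against the set the UI knows about. This is only an illustration. `KNOWN_SOURCES` and `unexpected_sources` are hypothetical names, the set contents are an assumption (the real ot-ui-apps dictionaries are not shown in this issue), and the rows are reduced to the fields quoted above.

```python
from typing import FrozenSet, Iterable, Set

# Hypothetical set of sources an older UI dictionary might list.
# The actual contents of the ot-ui-apps dictionaries are an assumption here.
KNOWN_SOURCES: FrozenSet[str] = frozenset({"FDA", "DailyMed", "ClinicalTrials", "ATC"})


def unexpected_sources(
    rows: Iterable[dict], known: FrozenSet[str] = KNOWN_SOURCES
) -> Set[str]:
    """Return every references[].source value that is missing from `known`."""
    found = {
        ref["source"]
        for row in rows
        for ref in row.get("references", [])
        if "source" in ref
    }
    return found - known


# Rows reduced to the fields shown in the response chunk above.
rows = [
    {
        "references": [
            {"ids": ["label/2020/761158s000lbl.pdf"], "source": "FDA"},
            {"ids": ["EMEA/H/C/004935"], "source": "EMA"},
            {"ids": ["16a160a4-3ec0-4ddf-99ce-05912dd3382d"], "source": "DailyMed"},
        ]
    }
]

print(unexpected_sources(rows))  # {'EMA'}
```

With this assumed set, EMA is the only value flagged, which matches the guess that the widget breaks on sources it has no entry for.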

Looking at the codebase, a few dictionaries contain incomplete lists of sources. I guess some might not even be relevant.

I don't know in which context these dictionaries are relevant, but it's likely that the Known Drugs / Clinical Precedence and/or evidence widgets are also misbehaving.

The UI dictionary needs to contain all the following sources (cc @ireneisdoomed):

Indication dataset

import pyspark.sql.functions as f

(
    df.select(f.explode("indications").alias("expl"))
    .select(f.explode("expl.references").alias("expl2"))
    .groupBy("expl2.source")
    .count()
    .show()
)
+--------------+-----+
|        source|count|
+--------------+-----+
|          USAN| 1601|
|           EMA| 1163|
|           ATC| 3015|
|           INN|  421|
|      DailyMed| 5780|
|           FDA|  790|
|ClinicalTrials|53825|
+--------------+-----+

Evidence dataset

+--------------+------+
|      niceName| count|
+--------------+------+
|           EMA|  1915|
|           ATC|  4597|
|      DailyMed|102404|
|           FDA|  1780|
|ClinicalTrials|546989|
+--------------+------+

Of the above, we would only want to change ClinicalTrials to ClinicalTrials.gov. The rest could stay as they are, if this helps to bulletproof the fix against future changes.
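
One way to make the lookup robust to future sources is to keep overrides only for the names that need a different label and fall back to the raw source name otherwise. This is a minimal sketch in Python rather than the actual ot-ui-apps code; `SOURCE_LABELS` and `source_label` are hypothetical names.

```python
# Only sources whose display label differs from the raw value need an entry.
# Hypothetical mapping; the real UI dictionary may be structured differently.
SOURCE_LABELS = {"ClinicalTrials": "ClinicalTrials.gov"}


def source_label(source: str) -> str:
    """Return the display label for a source, falling back to the raw name."""
    return SOURCE_LABELS.get(source, source)


for name in ["ClinicalTrials", "EMA", "USAN", "SomeFutureSource"]:
    print(name, "->", source_label(name))
```

Because unknown names fall through unchanged, a source added in a later release would be displayed as-is instead of hitting a missing dictionary entry.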

d0choa commented 1 week ago

Evidence widget looks healthy:

[Screenshot 2024-06-14 at 14:28:56]

Juanmaria-rr commented 1 week ago

I have compared some associations between the evidence file I have access to (“gs://otar000-evidence_input/Genetics_portal/json/genetics-portal-evidence-2024-04-16.json.gz”) and the dev platform version.

Target IL5 and asthma: in the evidence file I have 8 evidences, only 1 of them with coloc directionality. In the dev platform I see 10 evidences, 3 of them with directionality. https://platform.dev.opentargets.xyz/disease/MONDO_0004979/associations :x:

  1. TYK2 with lymphocyte counts (the case reported by Luca Stefanucci, where we discussed how the directionality was calculated). This is corrected: same number of evidences and same directionality as reported in the file. https://platform.dev.opentargets.xyz/disease/EFO_0004587/associations :heavy_check_mark:
  2. HMGCR and hypercholesterolemia. Corrected directionalities in comparison with the 24.03 platform version. Same number of evidences and same directionality as reported in the file. https://platform.opentargets.org/disease/HP_0003124/associations :heavy_check_mark:

In the first case, I do not fully understand why there are fewer evidences in the evidence file than in the dev platform version.

prashantuniyal02 commented 1 week ago

Hi @ireneisdoomed, the FE team needs URL links for the following sources in the indication widget to update the dictionary:

Could you please help with this?

ireneisdoomed commented 1 week ago

Overall, it'd be good to simply link to the same resource they use.

1. INN. A constant: https://www.who.int/publications/m/item/inn-pl-126

2. EMA. Not trivial: the reference is complete in the evidence set but not here. For CHEMBL4297551:

It's easy to find the EMA page if you google the product number, but the pages are indexed by commercial name. There is no reliable, direct way for us to build this link. I have 3 suggestions in order of preference:

3. USAN. Simpler case: the drug finder is indexed by USAN name, which is the first element of {indications.rows.references.ids}. So something like this: https://searchusan.ama-assn.org/finder/usan/search/{indications.rows.references.ids[0]}/relevant/1/
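
The INN and USAN rules above could be sketched as a small link builder. This is an illustration only, not the implemented fix: `reference_url` is a hypothetical helper, percent-encoding the USAN name is my assumption, and EMA is deliberately left without a link because no reliable pattern is given in this thread.

```python
from typing import List, Optional
from urllib.parse import quote

# Constant INN link and USAN search pattern, as given in the comment above.
INN_URL = "https://www.who.int/publications/m/item/inn-pl-126"
USAN_URL_TEMPLATE = (
    "https://searchusan.ama-assn.org/finder/usan/search/{name}/relevant/1/"
)


def reference_url(source: str, ids: List[str]) -> Optional[str]:
    """Build a link for a reference, or return None when no rule applies."""
    if source == "INN":
        return INN_URL
    if source == "USAN" and ids:
        # The USAN name is the first element of ids; encoding it is an assumption.
        return USAN_URL_TEMPLATE.format(name=quote(ids[0], safe=""))
    # EMA and any other source: no reliable direct link is described here.
    return None


print(reference_url("INN", []))
print(reference_url("EMA", ["EMEA/H/C/004935"]))
```

Returning `None` instead of raising lets the caller render the source name as plain text when no link can be built.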

prashantuniyal02 commented 1 week ago

This has been resolved with the latest release.