opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Generate gentropy-based L2G evidence #3453

Closed d0choa closed 1 month ago

d0choa commented 2 months ago

Data

The new GWAS credible set evidence will require the following fields:

The generation of this data needs to be implemented by taking the l2g_predictions and exploding the EFOs using the credible sets and study_index. For now, we will reproduce the previous L2G threshold of 0.05.
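The generation step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the gentropy implementation: the input field names (`geneId`, `score`, `traitFromSourceMappedIds`), the output field names other than `studyLocusId` and `resourceScore`, the inclusive `>=` comparison, and all identifiers in the sample data are hypothetical.

```python
L2G_THRESHOLD = 0.05  # threshold named in this ticket


def build_l2g_evidence(l2g_predictions, credible_sets, study_index,
                       threshold=L2G_THRESHOLD):
    """Join predictions to studies and emit one evidence row per EFO."""
    # Assumed: each credible set points to exactly one study.
    study_by_locus = {cs["studyLocusId"]: cs["studyId"] for cs in credible_sets}
    # Assumed: the study index carries a list of mapped EFO identifiers.
    efos_by_study = {
        s["studyId"]: s.get("traitFromSourceMappedIds") or []
        for s in study_index
    }
    evidence = []
    for pred in l2g_predictions:
        if pred["score"] < threshold:
            continue  # drop predictions below the L2G threshold
        study_id = study_by_locus.get(pred["studyLocusId"])
        # "Explode": one row per EFO; studies without EFOs yield no rows.
        for efo in efos_by_study.get(study_id, []):
            evidence.append({
                "datasourceId": "gwas_credible_sets",
                "datatypeId": "genetic_association",
                "targetFromSourceId": pred["geneId"],
                "diseaseFromSourceMappedId": efo,
                "studyLocusId": pred["studyLocusId"],
                "resourceScore": pred["score"],
            })
    return evidence


# Made-up sample data, for illustration only.
predictions = [
    {"studyLocusId": "sl1", "geneId": "GENE_A", "score": 0.8},
    {"studyLocusId": "sl1", "geneId": "GENE_B", "score": 0.01},
    {"studyLocusId": "sl2", "geneId": "GENE_C", "score": 0.3},
]
credible_sets = [
    {"studyLocusId": "sl1", "studyId": "st1"},
    {"studyLocusId": "sl2", "studyId": "st2"},
]
study_index = [
    {"studyId": "st1", "traitFromSourceMappedIds": ["EFO_1", "EFO_2"]},
    {"studyId": "st2", "traitFromSourceMappedIds": []},
]
evidence = build_l2g_evidence(predictions, credible_sets, study_index)
```

In this sketch, GENE_B is dropped by the threshold and GENE_C produces no evidence because its study has no mapped EFO, so only the two GENE_A rows remain. Whether such unmapped studies should be dropped or reported is a design choice the ticket does not settle.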

As discussed with @DSuveges, it would be good to eventually complete the evidence with more metadata to make it more readable using the pre-existing fields in the schema (e.g. studyId, pValue, etc.). This is only scoping the minimum set of data that MUST be there.

Platform ETL

To consume this data, we will need to run the platform ETL (from evidence downwards). For now, I would run this evidence side-by-side with the current evidence, even though associations will be corrupted. Eventually we will drop the old evidence.

{
  id: "gwas_credible_sets",
  datatype-id: "genetic_association",
  unique-fields: [
    "studyLocusId"
  ],
  score-expr: "resourceScore"
}

Note: If I remember correctly, targetId and diseaseId are unique-fields by default, so this only lists what else defines uniqueness.
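To make the intent of `unique-fields` concrete, here is a minimal sketch of deduplicating evidence on targetId and diseaseId plus the configured extra fields. It illustrates the idea only and is not the platform ETL's code; the helper names, the "keep the first row" rule, and the sample identifiers are assumptions.

```python
def evidence_key(row, unique_fields=("studyLocusId",)):
    """Build the uniqueness key: default fields plus the configured ones."""
    # Assumed (per the note above): targetId and diseaseId are always included.
    return (row["targetId"], row["diseaseId"],
            *(row[field] for field in unique_fields))


def deduplicate(rows, unique_fields=("studyLocusId",)):
    """Keep the first row seen for each uniqueness key (an assumption)."""
    seen = {}
    for row in rows:
        seen.setdefault(evidence_key(row, unique_fields), row)
    return list(seen.values())


# Made-up rows: the first two share target, disease and studyLocusId.
rows = [
    {"targetId": "GENE_A", "diseaseId": "EFO_1", "studyLocusId": "sl1",
     "resourceScore": 0.8},
    {"targetId": "GENE_A", "diseaseId": "EFO_1", "studyLocusId": "sl1",
     "resourceScore": 0.8},
    {"targetId": "GENE_A", "diseaseId": "EFO_1", "studyLocusId": "sl2",
     "resourceScore": 0.4},
]
unique_rows = deduplicate(rows)
```

Here the second row collapses into the first, while the third survives because it comes from a different credible set.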

Evidence API

New columns that need to be added to the evidence endpoint:

A follow-up action will be to populate the top L2G column in the credible sets using this data. But we can scope that once this data has been loaded.

@ireneisdoomed could you hand-craft this dataset in JSONL for @jdhayhurst? I think we have been using studyLocusId from 22.06, so perhaps it's the least problematic one to use for now.
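For reference, JSONL means one JSON object per line. A minimal sketch of serialising a hand-crafted dataset with the standard library follows; the helper name and the sample rows are hypothetical, and the field names other than `studyLocusId` and `resourceScore` are assumptions.

```python
import io
import json


def write_jsonl(rows, handle):
    """Write each row as a single JSON object on its own line."""
    for row in rows:
        handle.write(json.dumps(row, sort_keys=True) + "\n")


# Made-up evidence rows, for illustration only.
sample_rows = [
    {"studyLocusId": "sl1", "targetFromSourceId": "GENE_A",
     "diseaseFromSourceMappedId": "EFO_1", "resourceScore": 0.8},
    {"studyLocusId": "sl2", "targetFromSourceId": "GENE_C",
     "diseaseFromSourceMappedId": "EFO_2", "resourceScore": 0.3},
]

buffer = io.StringIO()  # stands in for an open file handle
write_jsonl(sample_rows, buffer)
lines = buffer.getvalue().splitlines()
```

Passing a real file opened with `open(path, "w")` in place of the `StringIO` buffer would write the same content to disk.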

addramir commented 2 months ago

Is it going to be part of gentropy or orchestration? I guess this is going to be a DAG?

d0choa commented 2 months ago

Ticket updated

ireneisdoomed commented 2 months ago

Data generation is blocked by https://github.com/opentargets/issues/issues/3555

prashantuniyal02 commented 1 month ago

Hi @ireneisdoomed, #3555 has been completed, so this task should be unblocked.

ireneisdoomed commented 1 month ago

Thank you @prashantuniyal02. I think I still need a new Gentropy data release to generate L2G results that align with the rest of the datasets.

DSuveges commented 1 month ago

Regarding the specification:

d0choa commented 1 month ago

Specification at the top updated based on discussion with @ireneisdoomed and @DSuveges