opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Co-loc widgets in credible set page #3510

Open buniello opened 1 month ago

buniello commented 1 month ago

As mentioned in #3436, the credible set page will also contain:

  1. Co-loc with GWAS SNPs - scoping in progress
  2. Co-loc with QTLs - scoping in progress

We shaped these two widgets on 11/09/24 (notes with me). Once the colocalisation API is ready, I will add instructions for FE widget development here.

Current data schema for colocalisation is:

root
 |-- leftStudyLocusId: long (nullable = false)
 |-- rightStudyLocusId: long (nullable = false)
 |-- chromosome: string (nullable = false)
 |-- colocalisationMethod: string (nullable = false)
 |-- numberColocalisingVariants: long (nullable = false)
 |-- h0: double (nullable = true)
 |-- h1: double (nullable = true)
 |-- h2: double (nullable = true)
 |-- h3: double (nullable = true)
 |-- h4: double (nullable = true)
 |-- log2h4h3: double (nullable = true)
 |-- clpp: double (nullable = true)
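
As a reading aid, here is a minimal Python sketch of one row of this schema. The class name `ColocalisationRow` and the helper `populated_statistics` are hypothetical and not part of the pipeline. The example values are taken from the API response further down in this issue, on the assumption that `leftStudyLocusId` is the queried credible set and `rightStudyLocusId` is the colocalising one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ColocalisationRow:
    """One row of the colocalisation dataset; field names copied from the schema."""

    # Non-nullable columns.
    leftStudyLocusId: int
    rightStudyLocusId: int
    chromosome: str
    colocalisationMethod: str
    numberColocalisingVariants: int
    # Nullable columns: a given method is expected to fill only some of these.
    h0: Optional[float] = None
    h1: Optional[float] = None
    h2: Optional[float] = None
    h3: Optional[float] = None
    h4: Optional[float] = None
    log2h4h3: Optional[float] = None
    clpp: Optional[float] = None


def populated_statistics(row: ColocalisationRow) -> List[str]:
    """Return the names of the nullable statistics that are set on this row."""
    names = ("h0", "h1", "h2", "h3", "h4", "log2h4h3", "clpp")
    return [name for name in names if getattr(row, name) is not None]


# Values from the example response in this issue (left/right mapping is assumed).
example = ColocalisationRow(
    leftStudyLocusId=4996263116703201782,
    rightStudyLocusId=8446085221407731539,
    chromosome="18",
    colocalisationMethod="eCAVIAR",
    numberColocalisingVariants=53,
    clpp=0.02804825279122239,
)
print(populated_statistics(example))
```

In the example response every eCAVIAR entry has `clpp` set and `h3`/`h4` null, which is why the nullable columns default to `None` here.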

[placeholder comment]

Originally posted by @buniello in https://github.com/opentargets/issues/issues/3436#issuecomment-2352878666

buniello commented 1 month ago

Subheader: which GWAS studies colocalise with this credible set? Source: Open Targets

E.g. for a fixed `studyLocusId` (we are on the credible set page):

```
query CredibleSetWithColocalisation {
  credibleSets(studyLocusIds: ["4996263116703201782"]) {
    variant {
      id
    }
    study {
      projectId
      studyType
      traitFromSource
      biosampleFromSourceId
      nSamples
      summarystatsLocation
      hasSumstats
      initialSampleSize
      publicationJournal
      publicationDate
      nControls
      pubmedId
      publicationFirstAuthor
      publicationTitle
      nCases
    }
    studyLocusId
    colocalisation(studyTypes: [gwas, eqtl, tuqtl]) {
      otherStudyLocus {
        study {
          studyId
          projectId
          studyType
          traitFromSource
          biosampleFromSourceId
          nSamples
          summarystatsLocation
          hasSumstats
          initialSampleSize
          publicationJournal
          publicationDate
          nControls
          pubmedId
          publicationFirstAuthor
          publicationTitle
          nCases
        }
        variant {
          id
        }
        studyLocusId
      }
      chromosome
      rightStudyType
      numberColocalisingVariants
      colocalisationMethod
      clpp
      h3
      h4
    }
  }
}
```
Response ``` { "data": { "credibleSets": [ { "variant": { "id": "18_60184226_G_GCCTCCCCCTCACCAAACTTAA" }, "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Leg fat mass and leg lean mass (pleiotropy)", "biosampleFromSourceId": null, "nSamples": 12517, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "9,684 European ancestry individuals, 1,541 Han Chinese ancestry individuals, 847 African American individuals, 445 Hispanic individuals", "publicationJournal": "Int J Obes (Lond)", "publicationDate": "2020-07-27", "nControls": 0, "pubmedId": "32719433", "publicationFirstAuthor": "Liu Y", "publicationTitle": "Four pleiotropic loci associated with fat mass and lean mass.", "nCases": 0 }, "studyLocusId": "4996263116703201782", "colocalisation": [ { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 61, "colocalisationMethod": "eCAVIAR", "clpp": 0.004185303183867549, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Obesity", "biosampleFromSourceId": null, "nSamples": 2796, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "695 European ancestry adult cases, 731 European ancestry adult controls, 685 European ancestry child cases, 685 European ancestry child controls", "publicationJournal": "Nat Genet", "publicationDate": "2009-01-18", "nControls": 1416, "pubmedId": "19151714", "publicationFirstAuthor": "Meyre D", "publicationTitle": "Genome-wide association study for early-onset and morbid adult obesity identifies three new risk loci in European populations.", "nCases": 1380 }, "variant": { "id": "18_60183864_T_C" }, "studyLocusId": "8446085221407731539" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 53, "colocalisationMethod": "eCAVIAR", "clpp": 0.02804825279122239, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", 
"studyType": "gwas", "traitFromSource": "Lung function (FVC)", "biosampleFromSourceId": null, "nSamples": 372000, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "approximately 372,000 European ancestry individuals", "publicationJournal": "Am J Hum Genet", "publicationDate": "2018-12-27", "nControls": 0, "pubmedId": "30595370", "publicationFirstAuthor": "Kichaev G", "publicationTitle": "Leveraging Polygenic Functional Enrichment to Improve GWAS Power.", "nCases": 0 }, "variant": { "id": "18_60079983_G_T" }, "studyLocusId": "8922755842398394081" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 60, "colocalisationMethod": "eCAVIAR", "clpp": 0.0000821875518118383, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Cholesteryl ester levels in large HDL", "biosampleFromSourceId": null, "nSamples": 115082, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90092001-GCST90093000/GCST90092846/harmonised/35213538-GCST90092846-EFO_0010351.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "115,082 European ancestry individuals", "publicationJournal": "PLoS Biol", "publicationDate": "2022-02-25", "nControls": 0, "pubmedId": "35213538", "publicationFirstAuthor": "Richardson TG", "publicationTitle": "Characterising metabolomic signatures of lipid-modifying therapies through drug target mendelian randomisation.", "nCases": 0 }, "variant": { "id": "18_60200293_A_G" }, "studyLocusId": "6255895796712074654" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 133, "colocalisationMethod": "eCAVIAR", "clpp": 0.0031026663546615853, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 20, "colocalisationMethod": "eCAVIAR", "clpp": 0.0015199423131325505, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", 
"rightStudyType": "gwas", "numberColocalisingVariants": 16, "colocalisationMethod": "eCAVIAR", "clpp": 0.0017335451416206373, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Leg fat free mass right (UKB data field 23113)", "biosampleFromSourceId": null, "nSamples": 174488, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90309001-GCST90310000/GCST90309844/harmonised/GCST90309844.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "174,488 European ancestry individuals", "publicationJournal": "Commun Biol", "publicationDate": "2024-02-13", "nControls": 0, "pubmedId": "38351177", "publicationFirstAuthor": "Jung H", "publicationTitle": "Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction.", "nCases": 0 }, "variant": { "id": "18_60183864_T_C" }, "studyLocusId": "8764954494484738570" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 19, "colocalisationMethod": "eCAVIAR", "clpp": 0.006417429663558727, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Phospholipid levels in large HDL", "biosampleFromSourceId": null, "nSamples": 115082, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90092001-GCST90093000/GCST90092852/harmonised/35213538-GCST90092852-EFO_0004639.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "115,082 European ancestry individuals", "publicationJournal": "PLoS Biol", "publicationDate": "2022-02-25", "nControls": 0, "pubmedId": "35213538", "publicationFirstAuthor": "Richardson TG", "publicationTitle": "Characterising metabolomic signatures of lipid-modifying therapies through drug target mendelian randomisation.", "nCases": 0 }, "variant": { "id": "18_60194728_G_A" }, "studyLocusId": "6928462624646614780" }, "chromosome": "18", "rightStudyType": 
"gwas", "numberColocalisingVariants": 70, "colocalisationMethod": "eCAVIAR", "clpp": 0.0022083558108185975, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Arm fat percentage right (UKB data field 23119)", "biosampleFromSourceId": null, "nSamples": 174488, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90309001-GCST90310000/GCST90309850/harmonised/GCST90309850.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "174,488 European ancestry individuals", "publicationJournal": "Commun Biol", "publicationDate": "2024-02-13", "nControls": 0, "pubmedId": "38351177", "publicationFirstAuthor": "Jung H", "publicationTitle": "Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction.", "nCases": 0 }, "variant": { "id": "18_60185354_T_C" }, "studyLocusId": "5355542411595000245" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 22, "colocalisationMethod": "eCAVIAR", "clpp": 0.034534402933972065, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Predicted visceral adipose tissue", "biosampleFromSourceId": null, "nSamples": 325153, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "325,153 British ancestry individuals", "publicationJournal": "Nat Med", "publicationDate": "2019-09-09", "nControls": 0, "pubmedId": "31501611", "publicationFirstAuthor": "Karlsson T", "publicationTitle": "Contribution of genetics to visceral adiposity and its relation to cardiovascular and metabolic disease.", "nCases": 0 }, "variant": { "id": "18_60183189_G_T" }, "studyLocusId": "5863480652200597925" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 19, "colocalisationMethod": "eCAVIAR", "clpp": 0.004189601709838443, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", 
"rightStudyType": "gwas", "numberColocalisingVariants": 145, "colocalisationMethod": "eCAVIAR", "clpp": 0.016385510969953995, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Cholesterol esters in large HDL", "biosampleFromSourceId": null, "nSamples": 136016, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "4,435 East Asian ancestry individuals, 11,340 South Asian ancestry individuals, 120,241 European ancestry individuals", "publicationJournal": "Nature", "publicationDate": "2024-03-06", "nControls": 0, "pubmedId": "38448586", "publicationFirstAuthor": "Karjalainen MK", "publicationTitle": "Genome-wide characterization of circulating metabolic biomarkers.", "nCases": 0 }, "variant": { "id": "18_60187461_C_T" }, "studyLocusId": "5039269798320777922" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 91, "colocalisationMethod": "eCAVIAR", "clpp": 0.029046792876845952, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 12, "colocalisationMethod": "eCAVIAR", "clpp": 0.0007052659982711487, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Feeling worry", "biosampleFromSourceId": null, "nSamples": 372869, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST006001-GCST007000/GCST006950/harmonised/29500382-GCST006950-EFO_0009589.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "372,869 European ancestry individuals", "publicationJournal": "Nat Commun", "publicationDate": "2018-03-02", "nControls": 0, "pubmedId": "29500382", "publicationFirstAuthor": "Nagel M", "publicationTitle": "Item-level analyses reveal genetic heterogeneity in neuroticism.", "nCases": 0 }, "variant": { "id": "18_60182196_C_T" }, "studyLocusId": "7162798918931343623" }, "chromosome": "18", 
"rightStudyType": "gwas", "numberColocalisingVariants": 178, "colocalisationMethod": "eCAVIAR", "clpp": 0.05067786564969157, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Ankle spacing width left (UKB data field 4100)", "biosampleFromSourceId": null, "nSamples": 174488, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90309001-GCST90310000/GCST90309913/harmonised/GCST90309913.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "174,488 European ancestry individuals", "publicationJournal": "Commun Biol", "publicationDate": "2024-02-13", "nControls": 0, "pubmedId": "38351177", "publicationFirstAuthor": "Jung H", "publicationTitle": "Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction.", "nCases": 0 }, "variant": { "id": "18_60099280_A_G" }, "studyLocusId": "6357372367343057969" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 68, "colocalisationMethod": "eCAVIAR", "clpp": 0.00022160267397273398, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 117, "colocalisationMethod": "eCAVIAR", "clpp": 0.017256222631633948, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 92, "colocalisationMethod": "eCAVIAR", "clpp": 0.02415697545154131, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Body mass index", "biosampleFromSourceId": null, "nSamples": 532396, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90029001-GCST90030000/GCST90029007/harmonised/29892013-GCST90029007-EFO_0004340.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "532,396 European ancestry individuals", "publicationJournal": "Nat Genet", 
"publicationDate": "2018-07-01", "nControls": 0, "pubmedId": "29892013", "publicationFirstAuthor": "Loh PR", "publicationTitle": "Mixed-model association for biobank-scale datasets.", "nCases": 0 }, "variant": { "id": "18_60161902_T_C" }, "studyLocusId": "5937172987187647901" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 16, "colocalisationMethod": "eCAVIAR", "clpp": 0.001396207182704726, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Age of smoking initiation", "biosampleFromSourceId": null, "nSamples": 618541, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "618,541 European ancestry individuals", "publicationJournal": "Nature", "publicationDate": "2022-12-07", "nControls": 0, "pubmedId": "36477530", "publicationFirstAuthor": "Saunders GRB", "publicationTitle": "Genetic diversity fuels gene discovery for tobacco and alcohol use.", "nCases": 0 }, "variant": { "id": "18_60183189_G_T" }, "studyLocusId": "6964903310688346810" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 23, "colocalisationMethod": "eCAVIAR", "clpp": 0.02606804637391314, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "HDL cholesterol", "biosampleFromSourceId": null, "nSamples": 94595, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST002001-GCST003000/GCST002223/harmonised/24097068-GCST002223-EFO_0004612.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "94,595 European ancestry individuals", "publicationJournal": "Nat Genet", "publicationDate": "2013-10-06", "nControls": 0, "pubmedId": "24097068", "publicationFirstAuthor": "Willer CJ", "publicationTitle": "Discovery and refinement of loci associated with lipid levels.", "nCases": 0 }, "variant": { "id": "18_60161902_T_C" }, "studyLocusId": "7452645057410397991" }, 
"chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 94, "colocalisationMethod": "eCAVIAR", "clpp": 0.023690505947561523, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 21, "colocalisationMethod": "eCAVIAR", "clpp": 0.002756654243010316, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Anxiety", "biosampleFromSourceId": null, "nSamples": 136957, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "at least 23,606 European ancestry cases, at least 113,351 European ancestry controls", "publicationJournal": "Nat Hum Behav", "publicationDate": "2021-04-15", "nControls": 113351, "pubmedId": "33859377", "publicationFirstAuthor": "Thorp JG", "publicationTitle": "Symptom-level modelling unravels the shared genetic architecture of anxiety and depression.", "nCases": 23606 }, "variant": { "id": "18_60181136_A_T" }, "studyLocusId": "7888940741429223779" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 81, "colocalisationMethod": "eCAVIAR", "clpp": 0.025302298354386125, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "HDL cholesterol levels", "biosampleFromSourceId": null, "nSamples": 127326, "summarystatsLocation": null, "hasSumstats": false, "initialSampleSize": "89,893 European ancestry individuals, 20,989 African American individuals, 12,450 Asian ancestry individuals, 3,994 HIspanic individuals", "publicationJournal": "Am J Epidemiol", "publicationDate": "2019-01-29", "nControls": 0, "pubmedId": "30698716", "publicationFirstAuthor": "de Vries PS", "publicationTitle": "Multi-Ancestry Genome-Wide Association Study of Lipid Levels Incorporating Gene-Alcohol Interactions.", "nCases": 0 }, "variant": { "id": "18_60182196_C_T" }, "studyLocusId": "8067450662469889832" }, 
"chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 50, "colocalisationMethod": "eCAVIAR", "clpp": 0.047512211287394086, "h3": null, "h4": null }, { "otherStudyLocus": null, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 21, "colocalisationMethod": "eCAVIAR", "clpp": 0.005725751575785604, "h3": null, "h4": null }, { "otherStudyLocus": { "study": { "projectId": "GCST", "studyType": "gwas", "traitFromSource": "Triglyceride levels (UKB data field 30870)", "biosampleFromSourceId": null, "nSamples": 389562, "summarystatsLocation": "ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90014001-GCST90015000/GCST90014014/harmonised/34017140-GCST90014014-EFO_0004530.h.tsv.gz", "hasSumstats": true, "initialSampleSize": "389,562 British ancestry individuals", "publicationJournal": "Nat Genet", "publicationDate": "2021-05-20", "nControls": 0, "pubmedId": "34017140", "publicationFirstAuthor": "Mbatchou J", "publicationTitle": "Computationally efficient whole-genome regression for quantitative and binary traits.", "nCases": 0 }, "variant": { "id": "18_60071840_C_T" }, "studyLocusId": "5866546903464061782" }, "chromosome": "18", "rightStudyType": "gwas", "numberColocalisingVariants": 96, "colocalisationMethod": "eCAVIAR", "clpp": 0.00006126950507720074, "h3": null, "h4": null } ] } ] } } ```

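Several entries in the response above come back with `"otherStudyLocus": null`, so the widget needs to decide what to do with them. The sketch below separates resolved from unresolved entries; the function name `split_by_other_study_locus` is hypothetical, and the sample is a trimmed copy of the first two entries of the response.

```python
from typing import Any, Dict, List, Tuple

Coloc = Dict[str, Any]


def split_by_other_study_locus(
    response: Dict[str, Any],
) -> Tuple[List[Coloc], List[Coloc]]:
    """Return (resolved, unresolved) colocalisation entries from a response."""
    resolved: List[Coloc] = []
    unresolved: List[Coloc] = []
    for credible_set in response["data"]["credibleSets"]:
        # Treat a missing or null colocalisation list as empty.
        for coloc in credible_set.get("colocalisation") or []:
            if coloc.get("otherStudyLocus") is None:
                unresolved.append(coloc)
            else:
                resolved.append(coloc)
    return resolved, unresolved


# Trimmed copy of the first two colocalisation entries in the response above.
sample_response = {
    "data": {
        "credibleSets": [
            {
                "studyLocusId": "4996263116703201782",
                "colocalisation": [
                    {
                        "otherStudyLocus": None,
                        "rightStudyType": "gwas",
                        "numberColocalisingVariants": 61,
                        "clpp": 0.004185303183867549,
                    },
                    {
                        "otherStudyLocus": {"studyLocusId": "8446085221407731539"},
                        "rightStudyType": "gwas",
                        "numberColocalisingVariants": 53,
                        "clpp": 0.02804825279122239,
                    },
                ],
            }
        ]
    }
}

resolved, unresolved = split_by_other_study_locus(sample_response)
print(len(resolved), len(unresolved))
```

Whether unresolved entries should be hidden, or shown as rows without study details, is still an open design choice.
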
For each colocalisation object with rightStudyType: gwas (in API query example above)

TABLE (similar to current OTG):

  1. View Credible Set tab (linking out to the credible set page of each colocalising credible set, i.e. otherStudyLocus within the colocalisation object).
  2. Study: studyId (linking out to its study page)
  3. Trait: traitFromSource (hyperlinked). Open question: I don't see a mapped trait field within the study object; is there one?
  4. Author: publicationFirstAuthor hyperlinked to https://pubmed.ncbi.nlm.nih.gov/`pubmedId`/
  5. Lead Variant: variant id
  6. p-value: `pvalue` (sorting column)
  7. Colocalising Variants (n): numberColocalisingVariants
  8. Colocalisation Method: colocalisationMethod
  9. H3: h3. Tooltip: Posterior probability that the signals do not colocalise
  10. H4: h4. Tooltip: Posterior probability that the signals colocalise
  11. CLPP: clpp. Tooltip: eCAVIAR colocalisation posterior probability
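
The column mapping above could be sketched as follows. This is only an illustration in Python, not the FE implementation: the function name `gwas_coloc_rows` and the row keys are hypothetical, entries with a null `otherStudyLocus` are skipped as an assumption, and the p-value column is left out because `pvalue` is not in the example query. The sample entry is trimmed from the response above.

```python
from typing import Any, Dict, List, Optional


def pubmed_url(pubmed_id: Optional[str]) -> Optional[str]:
    """Build the PubMed link used for the Author column, if an id is present."""
    if not pubmed_id:
        return None
    return f"https://pubmed.ncbi.nlm.nih.gov/{pubmed_id}/"


def gwas_coloc_rows(colocalisations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map colocalisation objects with rightStudyType gwas to table rows."""
    rows = []
    for coloc in colocalisations:
        if coloc.get("rightStudyType") != "gwas":
            continue
        other = coloc.get("otherStudyLocus")
        if other is None:
            # Assumption: without the other credible set there is nothing to link to.
            continue
        study = other.get("study") or {}
        rows.append(
            {
                "credibleSet": other.get("studyLocusId"),
                "study": study.get("studyId"),
                "trait": study.get("traitFromSource"),
                "author": study.get("publicationFirstAuthor"),
                "authorUrl": pubmed_url(study.get("pubmedId")),
                "leadVariant": (other.get("variant") or {}).get("id"),
                "colocalisingVariants": coloc.get("numberColocalisingVariants"),
                "method": coloc.get("colocalisationMethod"),
                "h3": coloc.get("h3"),
                "h4": coloc.get("h4"),
                "clpp": coloc.get("clpp"),
            }
        )
    return rows


# One entry trimmed from the example response above.
sample = [
    {
        "otherStudyLocus": {
            "study": {
                "traitFromSource": "Obesity",
                "pubmedId": "19151714",
                "publicationFirstAuthor": "Meyre D",
            },
            "variant": {"id": "18_60183864_T_C"},
            "studyLocusId": "8446085221407731539",
        },
        "rightStudyType": "gwas",
        "numberColocalisingVariants": 53,
        "colocalisationMethod": "eCAVIAR",
        "clpp": 0.02804825279122239,
        "h3": None,
        "h4": None,
    }
]

rows = gwas_coloc_rows(sample)
print(rows[0]["author"], rows[0]["authorUrl"])
```
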
buniello commented 1 month ago

Subheader: which molQTLs colocalise with this credible set? Source: Open Targets

For each colocalisation object with rightStudyType: qtl

[pending example API query for gwas/qtl coloc here]

TABLE (similar to current OTG):

  1. View Credible Set tab (linking out to the credible set page of each colocalising credible set, i.e. otherStudyLocus within the colocalisation object).
  2. Gene: targetname hyperlinked to targetId (linking out to gene page)
  3. Study: studyId (linking out to its study page)
  4. Molecular Trait: traitFromSource [from current OTG, tbd if stays here as well]
  5. Affected tissue/cell: biosampleFromSourceId hyperlinked to OLS [should be name hyperlinked to id in final version]
  6. QTL type: studyType
  7. Condition: not in API yet
  8. Lead Variant: variant id
  9. p-value: `pvalue`
  10. Colocalising Variants (n): numberColocalisingVariants
  11. Colocalisation Method: colocalisationMethod
  12. H3: h3. Tooltip: Posterior probability that the signals do not colocalise
  13. H4: h4. Tooltip: Posterior probability that the signals colocalise
  14. CLPP: clpp. Tooltip: eCAVIAR colocalisation posterior probability [do we need source/project ID as well, or is it redundant if we have `projectId`?]
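
Until the example gwas/qtl query is available, here is a tentative Python sketch of how the QTL rows might be selected and mapped. Everything here is an assumption: the names `is_qtl_study_type` and `qtl_coloc_rows` are hypothetical, a study type is treated as a QTL if it ends in "qtl" (as `eqtl` and `tuqtl` in the GWAS query do), and the Gene, Condition and p-value columns are omitted because those fields are not in the API query shown in this issue.

```python
from typing import Any, Dict, List, Optional


def is_qtl_study_type(study_type: Optional[str]) -> bool:
    """Assumed rule: any study type ending in 'qtl' (e.g. eqtl, tuqtl) is a QTL."""
    return bool(study_type) and study_type.lower().endswith("qtl")


def qtl_coloc_rows(colocalisations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map colocalisation objects with a QTL rightStudyType to table rows."""
    rows = []
    for coloc in colocalisations:
        if not is_qtl_study_type(coloc.get("rightStudyType")):
            continue
        other = coloc.get("otherStudyLocus")
        if other is None:
            # Assumption: rows without the other credible set are not shown.
            continue
        study = other.get("study") or {}
        rows.append(
            {
                "credibleSet": other.get("studyLocusId"),
                "study": study.get("studyId"),
                "molecularTrait": study.get("traitFromSource"),
                "biosample": study.get("biosampleFromSourceId"),
                "qtlType": study.get("studyType"),
                "leadVariant": (other.get("variant") or {}).get("id"),
                "colocalisingVariants": coloc.get("numberColocalisingVariants"),
                "method": coloc.get("colocalisationMethod"),
                "h3": coloc.get("h3"),
                "h4": coloc.get("h4"),
                "clpp": coloc.get("clpp"),
            }
        )
    return rows
```

The suffix check is only a stand-in; if the API exposes an explicit list of QTL study types, filtering on that list would be safer.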