opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Designing a public CRISPR widget #2928

Closed buniello closed 1 year ago

buniello commented 1 year ago

As we make progress with processing and interpreting CRISPR results from Brain CRISPR (#2821) and BIOGRID (#2920), we need to design a public CRISPR widget for the 23.06 release, keeping in mind that the PPP will host both a public and a pre-publication widget.

This issue will be completed with widget specifications once the schema for these datasets is finalised, as discussed with @DSuveges.

Evidence schema (example from Brain CRISPR):

{
  "studyId": "Glutamatergic Neuron-CellRox-CRISPRi",
  "datatypeId": "sysbio",
  "datasourceId": "crispr_screen",
  "projectId": "crispr_brain",
  "studyOverview": "CRISPRi FACS screen for reactive oxygen species in human iPSC-derived glutamatergic neurons.",
  "experiment": "WT CRISPRi-iPSCs were transduced with genome-wide CRISPRi-v2 sgRNA libraries. Two days after infection, the cells were selected for lentiviral integration using puromycin (1 g/mL) for 3 days as the cultures were expanded for the screens.",
  "cellType": "Glutamatergic Neuron",
  "contrast": null,
  "diseaseFromSource": "neurodegenerative disease, ageing",
  "diseaseFromSourceMappedId": "EFO_0005772",
  "targetFromSource": "MIS12",
  "pValue": 0.1350578615532632,
  "statisticalTestTail": "lower tail",
  "resourceScore": -3.5780208400785147
}
DSuveges commented 1 year ago

@buniello FYI: I have reviewed the ot_crispr dataset schema and made a few changes so that the two schemas are more similar. It now looks like this:

{
  "studyId": "Glutamatergic Neuron-Survival-CRISPRi",
  "datatypeId": "sysbio",
  "datasourceId": "crispr_screen",
  "projectId": "crispr_brain",
  "crisprScreenLibrary": "Genome-wide CRISPRi-v2",
  "studyOverview": "CRISPRi survival screen in human iPSC-derived glutamatergic neurons (standard medium).",
  "cellType": "Glutamatergic Neuron",
  "literature": [
    "34031600"
  ],
  "diseaseFromSource": "essential genes / neurodegenerative disease",
  "diseaseFromSourceMappedId": "EFO_0005772",
  "contrast": "Survival",
  "targetFromSourceId": "SOD2",
  "resourceScore": 0.0029283011705045,
  "statisticalTestTail": "lower tail",
  "log2FoldChangeValue": -24.05642773885583
}
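
For reference, the difference between the two example records in this thread can be listed with a small set comparison. This is only a sketch: the field names are copied from the two examples above, the helper name `schema_diff` is made up for illustration, and a diff of two example records is not necessarily the full schema change.

```python
# Field names copied from the two example evidence records in this thread.
first_draft_fields = {
    "studyId", "datatypeId", "datasourceId", "projectId", "studyOverview",
    "experiment", "cellType", "contrast", "diseaseFromSource",
    "diseaseFromSourceMappedId", "targetFromSource", "pValue",
    "statisticalTestTail", "resourceScore",
}
revised_fields = {
    "studyId", "datatypeId", "datasourceId", "projectId",
    "crisprScreenLibrary", "studyOverview", "cellType", "literature",
    "diseaseFromSource", "diseaseFromSourceMappedId", "contrast",
    "targetFromSourceId", "resourceScore", "statisticalTestTail",
    "log2FoldChangeValue",
}


def schema_diff(old, new):
    """Return (removed, added) field names, each sorted alphabetically.

    Hypothetical helper, not part of any Open Targets code base.
    """
    return sorted(old - new), sorted(new - old)


removed, added = schema_diff(first_draft_fields, revised_fields)
print("removed:", removed)
print("added:", added)
```

In these two examples, "targetFromSource" is replaced by "targetFromSourceId", and "resourceScore" appears to hold the significance value in the revised record while the effect size moves to "log2FoldChangeValue".
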
buniello commented 1 year ago

First draft of the CRISPR Screens (public) widget. Please see the data schema above for reference.

CS - CRISPR Screens
CRISPR knockout screens from public CRISPR datasources, associating "targetFromSourceId" and CRISPR results.
Sources: "projectId" (link), e.g. CRISPRBrain (this will become a comma-separated list once data from more sources are integrated)

Table
- Reported disease: "diseaseFromSource" (link to the "diseaseFromSourceMappedId" disease page; can be multiple links for multiple mappings)
- Study identifier: "studyId" (link to the CRISPRBrain study page, e.g. Glutamatergic Neuron-Survival-CRISPRi); identifiers will differ between sources once more sources are integrated
- Contrast / Study overview: "contrast" / "studyOverview" (tooltip: SCREEN LIBRARY: "crisprScreenLibrary")
- Cell type: "cellType"
- Log2 fold change: "log2FoldChangeValue"
- Significance: "resourceScore" (tooltip: STATISTICAL TEST TAIL: "statisticalTestTail")
- Source: "projectId" (link to the source, e.g. CRISPRBrain); we plan to integrate data from multiple sources over time, so this link can vary depending on the source
- Publication: "literature" (link to EPMC, similar to the OT Genetics widget)
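
To make the column mapping concrete, here is a rough sketch of how one evidence record could be turned into a table row. It is not the widget implementation: the function name `to_widget_row`, the two URL patterns, and the choice to join contrast and study overview with " / " are all assumptions for illustration. The CRISPRBrain study link is left out because its URL pattern is not given in this thread.

```python
from typing import Any, Dict

# Assumed URL patterns; neither is specified in this thread.
DISEASE_URL = "https://platform.opentargets.org/disease/{}"
EPMC_URL = "https://europepmc.org/article/MED/{}"


def to_widget_row(evidence: Dict[str, Any]) -> Dict[str, Any]:
    """Map one evidence record to the draft table columns (hypothetical helper)."""
    mapped_id = evidence.get("diseaseFromSourceMappedId")
    # Assumption: show whichever of contrast / studyOverview are present.
    overview_parts = [
        part
        for part in (evidence.get("contrast"), evidence.get("studyOverview"))
        if part
    ]
    return {
        "Reported disease": {
            "label": evidence.get("diseaseFromSource"),
            "link": DISEASE_URL.format(mapped_id) if mapped_id else None,
        },
        "Study identifier": evidence.get("studyId"),
        "Contrast / Study overview": {
            "label": " / ".join(overview_parts),
            "tooltip": evidence.get("crisprScreenLibrary"),
        },
        "Cell type": evidence.get("cellType"),
        "Log2 fold change": evidence.get("log2FoldChangeValue"),
        "Significance": {
            "value": evidence.get("resourceScore"),
            "tooltip": evidence.get("statisticalTestTail"),
        },
        "Source": evidence.get("projectId"),
        # "literature" is a list of PubMed IDs in the example record.
        "Publication": [EPMC_URL.format(pmid) for pmid in evidence.get("literature") or []],
    }


# Values taken from the revised example record above.
example = {
    "studyId": "Glutamatergic Neuron-Survival-CRISPRi",
    "projectId": "crispr_brain",
    "crisprScreenLibrary": "Genome-wide CRISPRi-v2",
    "studyOverview": "CRISPRi survival screen in human iPSC-derived glutamatergic neurons (standard medium).",
    "cellType": "Glutamatergic Neuron",
    "literature": ["34031600"],
    "diseaseFromSource": "essential genes / neurodegenerative disease",
    "diseaseFromSourceMappedId": "EFO_0005772",
    "contrast": "Survival",
    "resourceScore": 0.0029283011705045,
    "statisticalTestTail": "lower tail",
    "log2FoldChangeValue": -24.05642773885583,
}
row = to_widget_row(example)
```

Records without "literature" or with a null "contrast" (as in the first example in this thread) still produce a row, with empty values in the affected cells.
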

buniello commented 1 year ago

@LucaFumis: "datatypeId": "sysbio". Position of the new data source on the "Associations on the Fly" page: before Project Score (first data source in the sysbio aggregation).

[Image: Screenshot 2023-06-05 at 12 33 41]

buniello commented 1 year ago

Thanks @LucaFumis for the first version of the CRISPR Screens widget - it looks good! Just a few minor comments, as already discussed on Slack:
- Contrast / Study overview: delete the full stop at the end
- log2 fold change: format (i.e. trim)
- source: format it as CRISPRBrain (same as the link in the header)
- diseaseFromSource: when the record returns "essential genes / neurodegenerative disease", display it as neurodegenerative disease (essential genes) or as neurodegenerative disease (tooltip: essential genes)
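
The formatting comments above could be sketched as small helper functions. Everything here is illustrative: the function names, the source label lookup, and the qualifier set are hypothetical, and the number of significant digits is an assumption, since the comments only say "trim".

```python
# Assumed lookup from "projectId" to display label; only this entry appears in the thread.
SOURCE_LABELS = {"crispr_brain": "CRISPRBrain"}

# Assumed set of leading qualifiers to move into brackets (only example seen so far).
DISEASE_QUALIFIERS = {"essential genes"}


def strip_trailing_full_stop(text: str) -> str:
    """Drop a single full stop at the end of the study overview."""
    text = text.rstrip()
    return text[:-1] if text.endswith(".") else text


def trim_number(value: float, digits: int = 3) -> str:
    """Format a number to a few significant digits (3 is an assumption)."""
    return format(value, f".{digits}g")


def format_source(project_id: str) -> str:
    """Use the display label when known, otherwise fall back to the raw id."""
    return SOURCE_LABELS.get(project_id, project_id)


def format_reported_disease(raw: str) -> str:
    """Turn 'qualifier / disease' into 'disease (qualifier)' for known qualifiers."""
    parts = [part.strip() for part in raw.split(" / ")]
    if len(parts) == 2 and parts[0] in DISEASE_QUALIFIERS:
        qualifier, disease = parts
        return f"{disease} ({qualifier})"
    return raw


print(format_reported_disease("essential genes / neurodegenerative disease"))
print(trim_number(-24.05642773885583))
```

Strings that do not match the "qualifier / disease" pattern are returned unchanged, so values such as "neurodegenerative disease, ageing" pass through as they are.
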

LucaFumis commented 1 year ago

Thank you @buniello - I've updated the preview with the changes we discussed today and with the new datasource: it appears to be working correctly in both AOTF and classic associations. I still have to fix the table download.

[Image: Screenshot 2023-06-06 at 22 13 52]

buniello commented 1 year ago

In a second iteration (following releases), the CRISPRBrain widget/data need some clean-up in the reported disease field. Examples of datasets that need this, found by expanding the reported diseases one by one:

https://platform.dev.opentargets.xyz/evidence/ENSG00000132639/GO_0007568
https://platform.dev.opentargets.xyz/evidence/ENSG00000176884/MONDO_0005180

This is a reminder for Daniel and me.