opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Serialise COVID-19 table as JSON #1076

Closed AsierGonzalez closed 4 years ago

AsierGonzalez commented 4 years ago

Users have requested that the OT table with target information related to COVID-19 (see #1032) be made available as JSON.

AsierGonzalez commented 4 years ago

The first attempt, by @DSuveges, was to serialise the pandas dataframe directly as JSON. This does not work as expected: several columns hold lists, and by default those are saved as strings rather than JSON lists. We're now working on a better solution, which requires some changes to the parsers and the integrator.
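To illustrate the problem described above, here is a minimal sketch that uses only the standard library rather than pandas. It assumes the list cells were stored as their Python string representation (the thread does not say which form the strings took), and `row`, `LIST_COLUMNS` and `parse_list_cells` are hypothetical names made up for this example, not code from the parsers or the integrator.

```python
import ast
import json

# Hypothetical row; the field names are borrowed from the COVID-19 table.
row = {
    "name": "HRH1",
    # Assumption: the list cell was stored as its Python string representation.
    "hpa_subcellular_location": "['Plasma membrane', 'Cytosol']",
    "hpa_rna_specific_tissues": None,
}

# Hypothetical list of columns that are expected to hold lists.
LIST_COLUMNS = ["hpa_subcellular_location", "hpa_rna_specific_tissues"]


def parse_list_cells(record, list_columns):
    """Return a copy of record with stringified lists turned into real lists."""
    parsed = dict(record)
    for column in list_columns:
        value = parsed.get(column)
        # Missing values stay as None and are serialised as JSON null.
        if isinstance(value, str):
            parsed[column] = ast.literal_eval(value)
    return parsed


# Serialising the row as-is keeps the list as a single JSON string.
naive_json = json.dumps(row)

# Parsing the list cells first yields a proper JSON array.
fixed_json = json.dumps(parse_list_cells(row, LIST_COLUMNS))

print(naive_json)
print(fixed_json)
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals, which is safer for values read from a file.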

AsierGonzalez commented 4 years ago

PR #25 fixes this. The columns that contain lists are now JSON lists as seen in the example below:

{
  "ensembl_id": "ENSG00000196639",
  "biotype": "protein_coding",
  "name": "HRH1",
  "description": "histamine receptor H1 [Source:HGNC Symbol;Acc:HGNC:5182]",
  "uniprot_ids": "C9J2E6,P35367",
  "COVID-19 UniprotKB": false,
  "Implicated_in_viral_infection": null,
  "Covid_direct_interactions": null,
  "Covid_indirect_interactions": null,
  "is_abundance_reg_on_covid": null,
  "abundance_reg_on_covid": null,
  "invitro_covid_activity": "LORATADINE weakly active and cytotoxic (ELLINGER);CETIRIZINE inactive (TOURET);DOXYLAMINE inactive (TOURET);DESLORATADINE inactive (TOURET);OLOPATADINE inactive (TOURET);PHENIRAMINE inactive (TOURET);DIMENHYDRINATE highly active (TOURET);TRIPELENNAMINE highly active (TOURET);ANTAZOLINE inactive (TOURET);METHAPYRILENE inactive (TOURET);DIPHENYLPYRALINE inactive (TOURET);LEVOCABASTINE inactive (TOURET);MECLIZINE inactive (TOURET);CLEMASTINE inactive (TOURET);TERFENADINE inactive (TOURET);BETAHISTINE inactive (TOURET);ASTEMIZOLE inactive (TOURET);CINNARIZINE inactive (TOURET);CHLORPHENIRAMINE inactive (TOURET);PYRILAMINE inactive (TOURET);CYPROHEPTADINE inactive (TOURET);KETOTIFEN inactive (TOURET);EMEDASTINE inactive (TOURET);AZELASTINE inactive (TOURET);PROMETHAZINE inactive (TOURET);TRIMIPRAMINE inactive (TOURET);CYCLIZINE inactive (TOURET);DIPHENHYDRAMINE inactive (TOURET);TOLAZOLINE inactive (TOURET);BROMPHENIRAMINE inactive (TOURET);METHYLPROMAZINE inactive (TOURET);TRIPROLIDINE inactive (TOURET);CARBINOXAMINE inactive (TOURET);HYDROXYZINE inactive (TOURET);HISTAMINE inactive (TOURET);ORPHENADRINE inactive (TOURET);FEXOFENADINE inactive (TOURET);AZATADINE inactive (TOURET);LORATADINE inactive (TOURET);PROMETHAZINE highly active (WESTON);ASTEMIZOLE highly active (RIVA)",
  "has_invitro_covid_activity": "5/41",
  "has_drug_in_covid_trials": null,
  "drugs_in_covid_trials": null,
  "max_phase": 4,
  "drugs_in_clinic": 46,
  "hpa_subcellular_location": [
    "Plasma membrane",
    "Cytosol"
  ],
  "hpa_rna_tissue_distribution": "Detected in many",
  "hpa_rna_tissue_specificity": "Low tissue specificity",
  "hpa_rna_specific_tissues": null,
  "respiratory_system_is_expressed": true,
  "respiratory_system_expressed_tissue_list": [
    "lung",
    "tonsil"
  ],
  "immune_system_is_expressed": true,
  "immune_system_expressed_tissue_list": [
    "alternatively activated macrophage",
    "bone marrow",
    "CD14-positive, CD16-negative classical monocyte",
    "CD34-negative, CD41-positive, CD42-positive megakaryocyte cell",
    "CD4-positive, alpha-beta T cell",
    "CD4-positive, alpha-beta thymocyte",
    "CD8-positive, alpha-beta T cell",
    "central memory CD4-positive, alpha-beta T cell",
    "class switched memory B cell",
    "conventional dendritic cell",
    "EBV-transformed lymphocyte",
    "effector memory CD4-positive, alpha-beta T cell",
    "granulocyte monocyte progenitor cell",
    "inflammatory macrophage",
    "leukocyte",
    "lymph node",
    "macrophage",
    "mature neutrophil",
    "memory B cell",
    "neutrophilic metamyelocyte",
    "segmented neutrophil of bone marrow",
    "small intestine Peyer's patch",
    "spleen",
    "tonsil"
  ],
  "has_safety_risk": true,
  "safety_info_source": [
    "known_target_safety",
    "experimental_toxicity"
  ],
  "safety_organs_systems_affected": [
    "immune system",
    "cardiovascular system",
    "nervous system",
    "gastrointestinal system"
  ],
  "Tractability_Top_bucket_(sm)": "Targets with drugs in phase IV",
  "Tractability_Top_bucket_(ab)": "Targets located in the plasma membrane ",
  "Tractability_Top_bucket_(other)": null,
  "scientificName": "Homo sapiens",
  "FILTER_network": false,
  "FILTER_network+drug": false,
  "FILTER_network+covid_tests": true
}
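For anyone consuming the file, a rough sketch of reading one record follows. In the record above, `uniprot_ids` and `invitro_covid_activity` are still delimited strings while the other list columns are JSON arrays, so the sketch splits those two by hand. `record_text` is an abbreviated copy of the example, and `split_field` is a hypothetical helper, not part of any Open Targets code.

```python
import json

# Abbreviated copy of the example record; most fields are left out.
record_text = """
{
  "name": "HRH1",
  "uniprot_ids": "C9J2E6,P35367",
  "invitro_covid_activity": "LORATADINE weakly active and cytotoxic (ELLINGER);CETIRIZINE inactive (TOURET)",
  "hpa_subcellular_location": ["Plasma membrane", "Cytosol"],
  "hpa_rna_specific_tissues": null
}
"""

record = json.loads(record_text)


def split_field(value, separator):
    """Split a delimited string into a list; treat null as an empty list."""
    if value is None:
        return []
    return [item.strip() for item in value.split(separator) if item.strip()]


# Fields that are still delimited strings in the example record.
uniprot_ids = split_field(record["uniprot_ids"], ",")
assays = split_field(record["invitro_covid_activity"], ";")

# Fields that are already JSON arrays (or null when empty).
locations = record["hpa_subcellular_location"] or []
specific_tissues = record["hpa_rna_specific_tissues"] or []

print(uniprot_ids)
print(len(assays), "assay results")
print(locations)
```

Mapping null to an empty list is a choice made here for convenience; callers that need to tell "no data" apart from "empty" should keep the `None`.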