MatsDahlberg / clinicalDB

Compounds #21

Closed henrikstranneheim closed 10 years ago

henrikstranneheim commented 10 years ago
  1. Use the Compounds column to decide whether a variant is part of a compound; a value of "-" means it is not.
  2. Extract the compound score from a string in the format:

chr_start_ref-allele_alt-allele=Compound Score:

for example, with ":" separating the entries: 17_78184393_C_T=28:17_78187720_A_G=22
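
The format above could be parsed roughly as follows. This is only a sketch, not the parser used in clinicalDB: the function names `parse_compounds` and `max_compound_score` are made up for illustration, and it assumes that scores are integers (as in the example) and that ":" separates entries.

```python
def parse_compounds(field):
    """Parse a Compounds column value into (chrom, pos, ref, alt, score) tuples.

    Assumes the format chr_start_ref_alt=score, with ":" between entries.
    """
    if field in ("", "-"):
        return []  # "-" marks a variant that is not part of any compound
    compounds = []
    for entry in field.strip(":").split(":"):
        key, score = entry.split("=")
        # Split from the right so a chromosome name containing "_" stays intact.
        chrom, pos, ref, alt = key.rsplit("_", 3)
        compounds.append((chrom, int(pos), ref, alt, int(score)))
    return compounds


def max_compound_score(field):
    """Return the highest compound score in the field, or None if there is none."""
    scores = [c[4] for c in parse_compounds(field)]
    return max(scores) if scores else None


example = "17_78184393_C_T=28:17_78187720_A_G=22"
print(parse_compounds(example))
print(max_compound_score(example))  # 28
```

Returning `None` for "-" keeps non-compound variants distinguishable from compounds with a score of 0.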

MatsDahlberg commented 10 years ago

I have made http://clinical-db:8082/variants/23548020 return all compounds that belong to the variant.

Is that what you mean?

{
    "rating": "", 
    "location_reliability": "", 
    "scaled_c_score_snv": 22.8, 
    "functional_annotation": "nonsynonymous SNV", 
    "stop_bp": 75490896, 
    "dbsnp129": null, 
    "lrt_whole_exome": 2e-06, 
    "omim_morbid_desc": "", 
    "rank_score": 27, 
    "hgnc_synonyms": "", 
    "phast_const_elements": "Score=668;Name=lod=690", 
    "snorna_mirna_annotation": "-", 
    "chr": "chr5", 
    "hgnc_transcript_id": "SV2C:NM_014979:exon3:c.733C>T:p.L245F,", 
    "esp6500": null, 
    "hgnc_symbol": "SV2C", 
    "GT_call_filter": "PASS", 
    "pseudogene": "", 
    "clinical_db_gene_annotation": "NO", 
    "main_location": "", 
    "dbsnp132": null, 
    "disease_group": "", 
    "ref_nt": "C", 
    "mutation_taster": 0.991596, 
    "hgmd_accession": null, 
    "hgmd": "", 
    "gene_annotation": "exonic", 
    "omim_gene_desc": "", 
    "start_bp": 75490896, 
    "hgmd_variant_type": null, 
    "otherVariants": [], 
    "dbsnp": "-", 
    "sift_whole_exome": 0.01, 
    "id": 23548020, 
    "gerp_element": 909.8, 
    "dbsnp_id": "-", 
    "compounds": [
        {
            "alt_nt": "T", 
            "start_bp": 75427518, 
            "chr": "chr5", 
            "variant": 16159878, 
            "ref_nt": "C"
        }, 
        {
            "alt_nt": "A", 
            "start_bp": 75427935, 
            "chr": "chr5", 
            "variant": 16159879, 
            "ref_nt": "G"
        }, 
        {
            "alt_nt": "C", 
            "start_bp": 75470086, 
            "chr": "chr5", 
            "variant": 16159880, 
            "ref_nt": "T"
        }, 
        {
            "alt_nt": "T", 
            "start_bp": 75490892, 
            "chr": "chr5", 
            "variant": 16159881, 
            "ref_nt": "C"
        }
    ], 
    "scaled_c_score_1000g": null, 
    "hgmd_variant_pmid": null, 
    "polyphen_var_human": 0.97, 
    "other_location": "", 
    "disease_gene_model": null, 
    "hbvdb": null, 
    "alt_nt": "T", 
    "unscaled_c_score_1000g": null, 
    "phylop_whole_exome": 2.617, 
    "expression_type": "", 
    "gene_model": "AR_compound:AD_denovo", 
    "variant_count": null, 
    "individual_rank_score": 27, 
    "unscaled_c_score_snv": 4.33376, 
    "thousand_g": null, 
    "ensembl_geneid": "ENSG00000122012;", 
    "hgnc_approved_name": "synaptic vesicle glycoprotein 2C;", 
    "genomic_super_dups": "-", 
    "polyphen_div_human": 0.996, 
    "gerp_whole_exome": 5.27
}

henrikstranneheim commented 10 years ago

Exactly.

Next, the max compound score ("=XX") for each variant has to be extracted from the string and made globally sortable, so that we can filter on the best high-scoring compound pair.

"variant": 16159880, is that an internal ID?
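
One way the extraction and global sorting requested above might look is sketched below. It is an illustration only: `max_score` and `rank_by_max_compound` are hypothetical names, the variant ids `v1` to `v3` are placeholders rather than real database keys, and scores are assumed to be integers.

```python
import re

# Matches the "=XX" score that follows each compound entry.
SCORE_RE = re.compile(r"=(\d+)")


def max_score(field):
    """Return the largest "=XX" score in a Compounds string, or None."""
    scores = [int(s) for s in SCORE_RE.findall(field)]
    return max(scores) if scores else None


def rank_by_max_compound(rows, min_score=None):
    """Sort (variant_id, compounds_field) pairs by max compound score, best first.

    Variants without any compound score are dropped; min_score is an
    optional filter threshold.
    """
    scored = [(variant_id, max_score(field)) for variant_id, field in rows]
    scored = [(vid, s) for vid, s in scored if s is not None]
    if min_score is not None:
        scored = [(vid, s) for vid, s in scored if s >= min_score]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder ids; the first string is the example from the issue description.
rows = [
    ("v1", "17_78184393_C_T=28:17_78187720_A_G=22"),
    ("v2", "-"),
    ("v3", "17_78187720_A_G=22"),
]
print(rank_by_max_compound(rows))
print(rank_by_max_compound(rows, min_score=25))
```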

MatsDahlberg commented 10 years ago

Yes, variant is the internal key of the variant. It is needed so that Robin can create a link to the variant.

I forgot to include combined_score, so this is what the compounds entry looks like now:

    "compounds": [
        {
            "alt_nt": "T", 
            "ref_nt": "C", 
            "variant": 16159881, 
            "combined_score": 42, 
            "chr": "chr5", 
            "start_bp": 75490892
        }, 
        {
            "alt_nt": "A", 
            "ref_nt": "G", 
            "variant": 16159879, 
            "combined_score": 28, 
            "chr": "chr5", 
            "start_bp": 75427935
        }, 
        {
            "alt_nt": "T", 
            "ref_nt": "C", 
            "variant": 16159878, 
            "combined_score": 24, 
            "chr": "chr5", 
            "start_bp": 75427518
        }, 
        {
            "alt_nt": "C", 
            "ref_nt": "T", 
            "variant": 16159880, 
            "combined_score": 24, 
            "chr": "chr5", 
            "start_bp": 75470086
        }
    ], 

henrikstranneheim commented 10 years ago

Great!

Does the database parser do the heavy lifting of sorting, or does the front end do it? In other words, should I involve Robin?

MatsDahlberg commented 10 years ago

The database delivers the compounds ordered so that the one with the highest combined_score comes first.
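
A consumer that wants to rely on this ordering could verify it, or re-sort defensively, along these lines. This is a sketch using the compounds entry posted earlier in the thread; the helper names `is_sorted_by_score` and `sort_by_score` are hypothetical and not part of clinicalDB.

```python
def is_sorted_by_score(compounds):
    """True if combined_score never increases from one entry to the next."""
    scores = [c["combined_score"] for c in compounds]
    return all(a >= b for a, b in zip(scores, scores[1:]))


def sort_by_score(compounds):
    """Return a new list with the highest combined_score first.

    sorted() is stable, so entries with equal scores keep their relative order.
    """
    return sorted(compounds, key=lambda c: c["combined_score"], reverse=True)


# The compounds entry as posted in the thread.
compounds = [
    {"alt_nt": "T", "ref_nt": "C", "variant": 16159881, "combined_score": 42, "chr": "chr5", "start_bp": 75490892},
    {"alt_nt": "A", "ref_nt": "G", "variant": 16159879, "combined_score": 28, "chr": "chr5", "start_bp": 75427935},
    {"alt_nt": "T", "ref_nt": "C", "variant": 16159878, "combined_score": 24, "chr": "chr5", "start_bp": 75427518},
    {"alt_nt": "C", "ref_nt": "T", "variant": 16159880, "combined_score": 24, "chr": "chr5", "start_bp": 75470086},
]

print(is_sorted_by_score(compounds))  # True
best = compounds[0]  # highest-scoring partner when the list is already sorted
print(best["variant"], best["combined_score"])  # 16159881 42
```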

MatsDahlberg commented 10 years ago

There is now a service that returns all compounds belonging to a variant: http://clinical-db.scilifelab.se:8082/compounds/39461415

It returns:

[
    {
        "alt_nt": "A", 
        "ref_nt": "G", 
        "variant": 37432388, 
        "combined_score": 25, 
        "chr": "chr4", 
        "start_bp": 982852
    }, 
    {
        "alt_nt": "A", 
        "ref_nt": "C", 
        "variant": 37432391, 
        "combined_score": 25, 
        "chr": "chr4", 
        "start_bp": 980896
    }, 
    {
        "alt_nt": "A", 
        "ref_nt": "G", 
        "variant": 37432389, 
        "combined_score": 24, 
        "chr": "chr4", 
        "start_bp": 980932
    }, 
    {
        "alt_nt": "C", 
        "ref_nt": "G", 
        "variant": 37432390, 
        "combined_score": 21, 
        "chr": "chr4", 
        "start_bp": 984870
    }
]
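
A client could call this service and turn each compound's "variant" key into a link, roughly as sketched below. The helper names are hypothetical, and it is an assumption that the /variants/ and /compounds/ paths are served from the same host (the thread shows them with two different host names). The response is assumed to be a JSON list like the one above.

```python
import json
from urllib.request import urlopen

# Host taken from the /compounds/ URL in the thread; assumed to serve /variants/ too.
BASE_URL = "http://clinical-db.scilifelab.se:8082"


def compounds_url(variant_id, base_url=BASE_URL):
    """URL of the service listing all compounds of a variant."""
    return f"{base_url}/compounds/{variant_id}"


def variant_url(variant_id, base_url=BASE_URL):
    """URL of a single variant, built from its internal key."""
    return f"{base_url}/variants/{variant_id}"


def fetch_compounds(variant_id, base_url=BASE_URL, timeout=10):
    """Fetch and decode the compounds list (requires network access to the host)."""
    with urlopen(compounds_url(variant_id, base_url), timeout=timeout) as response:
        return json.load(response)


def compound_links(compounds, base_url=BASE_URL):
    """Pair each compound's combined_score with a link to the partner variant."""
    return [
        (c["combined_score"], variant_url(c["variant"], base_url))
        for c in compounds
    ]


# Offline usage with the first entry of the response shown above.
sample = [{"variant": 37432388, "combined_score": 25}]
print(compound_links(sample))
```

Because the service already returns the highest combined_score first, `compound_links` keeps the input order rather than sorting again.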