statgen / locuszoom

A Javascript/d3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources.
https://statgen.github.io/locuszoom/
MIT License

Remove dependence on ExAC #174

Status: Closed (abought closed 4 years ago)

abought commented 4 years ago

Certain tooltip links and gene constraint table features depend on ExAC, which will be decommissioned soon. We should remove all references to ExAC in tooltips and identify replacement data sources.

abought commented 4 years ago

Notes so far.

Delving into this one: it doesn't appear that BRAVO presently has this information.

The recommended solution is gnomAD, whose GraphQL API seems to contain the info we need but is not "official" yet.

The main barrier here is that the example queries I have found so far only request data for one gene at a time; we'd like a batch query ("all genes in region"). With ExAC, for example, we fetched all data in a single POST request from a list of gene IDs.

Breaking down a gnomAD gene page, here is a query that gets constraint data for one gene. It is sent to:

https://gnomad.broadinstitute.org/api/

query Gene($geneId: String, $geneSymbol: String, $referenceGenome: ReferenceGenomeId!) {
    gene(gene_id: $geneId, gene_symbol: $geneSymbol, reference_genome: $referenceGenome) {
        gnomad_constraint {
            exp_lof
            exp_mis
            exp_syn
            obs_lof
            obs_mis
            obs_syn
            oe_lof
            oe_lof_lower
            oe_lof_upper
            oe_mis
            oe_mis_lower
            oe_mis_upper
            oe_syn
            oe_syn_lower
            oe_syn_upper
            lof_z
            mis_z
            syn_z
            pLI
        }
    }
}

with variables:

{
    "geneId": "ENSG00000227560",
    "referenceGenome": "GRCh37"
}
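For reference, a minimal sketch of how a query like the one above might be sent from JavaScript. The endpoint is the one quoted above. The request shape (a POST with a JSON body holding `query` and `variables`) is the usual GraphQL-over-HTTP convention and is an assumption here, not something confirmed for this API. The helper names `buildConstraintRequest` and `fetchGeneConstraint` are hypothetical, and the query is trimmed to a few of the constraint fields.

```javascript
// Endpoint quoted in the comment above; the POST/JSON request shape is an assumption.
const GNOMAD_API_URL = 'https://gnomad.broadinstitute.org/api/';

// Trimmed version of the query above; add the other constraint fields as needed.
const GENE_CONSTRAINT_QUERY = `
query Gene($geneId: String, $geneSymbol: String, $referenceGenome: ReferenceGenomeId!) {
    gene(gene_id: $geneId, gene_symbol: $geneSymbol, reference_genome: $referenceGenome) {
        gnomad_constraint {
            exp_lof
            obs_lof
            oe_lof
            oe_lof_lower
            oe_lof_upper
            pLI
        }
    }
}`;

// Hypothetical helper: builds the fetch() options for a single gene.
function buildConstraintRequest(geneId, referenceGenome = 'GRCh37') {
    return {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            query: GENE_CONSTRAINT_QUERY,
            variables: { geneId, referenceGenome },
        }),
    };
}

// Hypothetical helper: one network round trip per gene.
async function fetchGeneConstraint(geneId, referenceGenome = 'GRCh37') {
    const response = await fetch(GNOMAD_API_URL, buildConstraintRequest(geneId, referenceGenome));
    if (!response.ok) {
        throw new Error(`gnomAD request failed with status ${response.status}`);
    }
    const payload = await response.json();
    // By GraphQL convention the result sits under `data`; return null if the gene is missing.
    const gene = payload.data ? payload.data.gene : null;
    return gene ? gene.gnomad_constraint : null;
}
```

Called as `fetchGeneConstraint('ENSG00000227560')`, this costs one request per gene, which is exactly the limitation described above for a region containing many genes.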

We'd like to do this programmatically with as few queries as possible. We should follow up on this after the Thanksgiving holiday.

abought commented 4 years ago

Initial exploration suggests no bulk API is available. We've removed the gene constraint table, but hope to add it back in the future as APIs evolve. (Users can still see this data in gnomAD via a link.)

abought commented 4 years ago

The gnomAD help desk has suggested a new query syntax to help with this problem; see below. We will need to evaluate this and implement it in a data source.

https://gnomad.broadinstitute.org/api?query=%7B%0A%20%20gene1%3A%20gene(gene_symbol%3A%20%22PCSK9%22%2C%20reference_genome%3A%20GRCh37)%20%7B%0A%20%20%20%20gnomad_constraint%20%7B%0A%20%20%20%20%20%20exp_lof%0A%20%20%20%20%20%20obs_lof%0A%20%20%20%20%20%20oe_lof%0A%20%20%20%20%20%20oe_lof_lower%0A%20%20%20%20%20%20oe_lof_upper%0A%20%20%20%20%7D%0A%20%20%7D%0A%20%20gene2%3A%20gene(gene_symbol%3A%20%22SCN1A%22%2C%20reference_genome%3A%20GRCh37)%20%7B%0A%20%20%20%20gnomad_constraint%20%7B%0A%20%20%20%20%20%20exp_lof%0A%20%20%20%20%20%20obs_lof%0A%20%20%20%20%20%20oe_lof%0A%20%20%20%20%20%20oe_lof_lower%0A%20%20%20%20%20%20oe_lof_upper%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D
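Decoded, the URL above is a single GraphQL document that requests `gene(...)` several times under different aliases (`gene1`, `gene2`), so one request returns constraint data for several genes. Below is a rough sketch of how such a query could be generated for an arbitrary gene list. The function names are hypothetical, the field list is copied from the URL above, and whether the API limits the size of such a query is not known here. Aliases are index-based because GraphQL names cannot contain characters such as hyphens, which some gene symbols do.

```javascript
// Fields requested in the help desk example; extend as needed.
const CONSTRAINT_FIELDS = ['exp_lof', 'obs_lof', 'oe_lof', 'oe_lof_lower', 'oe_lof_upper'];

// Hypothetical helper: builds one aliased GraphQL document covering every gene in the list.
function buildBatchConstraintQuery(geneSymbols, referenceGenome = 'GRCh37') {
    const fieldLines = CONSTRAINT_FIELDS.map((field) => `      ${field}`).join('\n');
    const fragments = geneSymbols.map((symbol, index) => {
        const alias = `gene${index + 1}`;
        // JSON.stringify adds the double quotes and escapes the symbol.
        const args = `gene_symbol: ${JSON.stringify(symbol)}, reference_genome: ${referenceGenome}`;
        return `  ${alias}: gene(${args}) {\n    gnomad_constraint {\n${fieldLines}\n    }\n  }`;
    });
    return `{\n${fragments.join('\n')}\n}`;
}

// Hypothetical helper: GET-style URL in the same form as the example above.
function buildBatchUrl(geneSymbols, referenceGenome = 'GRCh37') {
    const query = buildBatchConstraintQuery(geneSymbols, referenceGenome);
    return `https://gnomad.broadinstitute.org/api?query=${encodeURIComponent(query)}`;
}

// Hypothetical helper: maps the aliased `data` object of a response back to gene symbols.
function mapResultsBySymbol(geneSymbols, data) {
    const bySymbol = {};
    geneSymbols.forEach((symbol, index) => {
        const entry = data[`gene${index + 1}`];
        bySymbol[symbol] = entry ? entry.gnomad_constraint : null;
    });
    return bySymbol;
}
```

For example, `buildBatchUrl(['PCSK9', 'SCN1A'])` should yield a URL of the same shape as the one above, and `mapResultsBySymbol` turns the aliased response back into a lookup keyed by gene symbol, with `null` for genes that return no record.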

abought commented 4 years ago

Released in 0.10.1, with kind thanks to the gnomAD team for their assistance.