statgen / locuszoom

A JavaScript/d3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources.
https://statgen.github.io/locuszoom/
MIT License

Better decision of when to tag elements in a credible set #209

Closed (abought closed this issue 4 years ago)

abought commented 4 years ago

Purpose

The automatic credible set method will always rank SNPs, even if none of them in a given region are actually significant.

This can lead to misleading claims when there is no evidence of causality at all. We should provide a smarter way for the credible set source to operate: for example, not performing the calculation at all if all hits are below a threshold. Some design work is needed to decide what the criterion will be: a simple threshold? A hard cap on what % of SNPs in the region are "credible"? Something else?

Demonstration

An example region from the demo shows the problem: a very large number of SNPs are included in the credible set, but all of them fall below the GWAS significance line. https://statgen.github.io/locuszoom/examples/credible_sets.html?chrom=2&start=161046447&end=161646447

abought commented 4 years ago

After discussion with Ryan, we have implemented a rule as follows:

  1. If NO variants meet the GWAS significance threshold, then no credible set will be calculated. The significance threshold is a new, configurable source parameter.
  2. If ANY variant in the region is significant, then a credible set will be calculated for that region. This credible set will draw from the entire region, so it may include members that are not themselves significant, so long as at least one variant in the region meets the threshold.
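
As a rough illustration of the rule above (not the actual LocusZoom source code), the gate could look something like the sketch below. The function names, the `log_pvalue` field, and the default threshold of -log10(5e-8) are all assumptions made for this example; the real credible set calculation is passed in as a callback rather than reproduced here.

```javascript
// Hypothetical sketch of the gating rule; names are illustrative, not LocusZoom API.
// Assumes each record stores its association strength as -log10(p) in `log_pvalue`.
const DEFAULT_SIGNIFICANCE = 7.301; // approximately -log10(5e-8)

// True if at least one variant in the region meets the significance threshold.
function hasSignificantVariant(records, threshold = DEFAULT_SIGNIFICANCE) {
    return records.some((record) => record.log_pvalue >= threshold);
}

// Rule 1: with no significant variant, skip the calculation and return null.
// Rule 2: otherwise run the calculation over the ENTIRE region, so the
// resulting set may still contain variants that are not significant themselves.
function credibleSetForRegion(records, computeCredibleSet, threshold = DEFAULT_SIGNIFICANCE) {
    if (!hasSignificantVariant(records, threshold)) {
        return null;
    }
    return computeCredibleSet(records);
}
```

Here `computeCredibleSet` stands in for whatever routine ranks the variants. Passing the full `records` array, rather than only the significant hits, matches point 2: the threshold decides whether a set is calculated, not which variants are eligible for it.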

Closing for now; we are open to additional feedback/suggestions on improving this cutoff.