The automatic credible set method always ranks SNPs, even if none of them in a given region is actually significant.
This can lead to misleading claims when there is no evidence of causality at all. We should provide a smarter way for the credible set source to operate: for example, skipping the calculation entirely if all hits are below a threshold. Some design work is needed to decide what the criterion should be: a simple threshold? A hard cap on the percentage of SNPs in the region that are "credible"? Something else?
After discussion with Ryan, we have implemented a rule as follows:
If NO variant in the region meets the GWAS significance threshold, then no credible set will be calculated. The significance threshold is a new, configurable source parameter.
If ANY variant in the region is significant, then a credible set will be calculated for that region. This credible set draws from the entire region, so it may include members that are not themselves significant, as long as at least one variant in the region meets the threshold.
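The rule above can be sketched roughly as follows. This is an illustration only, not the actual source code: the function name, the default threshold of -log10(5e-8), the 95% coverage, and the use of exp(z^2 / 2) as an approximate Bayes factor are all assumptions made for the example.

```python
import math
from statistics import NormalDist

# Hypothetical default: -log10(5e-8), a conventional GWAS significance line.
GWAS_THRESHOLD = 7.301


def credible_set(neg_log_pvalues, threshold=GWAS_THRESHOLD, coverage=0.95):
    """Return indices of credible set members, or [] if nothing is significant."""
    # Rule 1: if NO variant meets the threshold, skip the calculation entirely.
    if not any(v >= threshold for v in neg_log_pvalues):
        return []

    # Rule 2: at least one variant is significant, so score the WHOLE region.
    # Assumed scoring: Bayes factor proportional to exp(z^2 / 2), with z
    # recovered from the two-sided p-value. P-values are clamped so that
    # extremely small values do not underflow to zero.
    normal = NormalDist()
    half_z2 = []
    for v in neg_log_pvalues:
        p = max(10.0 ** -v, 1e-300)
        half_z2.append(normal.inv_cdf(p / 2) ** 2 / 2)

    # Subtract the peak before exponentiating to avoid overflow.
    peak = max(half_z2)
    weights = [math.exp(h - peak) for h in half_z2]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    # Add variants in descending posterior order until coverage is reached.
    ranked = sorted(range(len(posteriors)), key=posteriors.__getitem__, reverse=True)
    members, cumulative = [], 0.0
    for i in ranked:
        members.append(i)
        cumulative += posteriors[i]
        if cumulative >= coverage:
            break
    return members


print(credible_set([1.0, 2.5, 6.9]))  # no variant reaches the threshold
print(credible_set([8.0, 7.0, 1.0]))  # one significant variant in the region
```

In the second call only the first variant is above the threshold, but the set is still drawn from the whole region, so a non-significant neighbor can end up as a member.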
Closing for now; we are open to additional feedback/suggestions on improving this cutoff.
Demonstration
An example region from the demo shows the situation in question: a very large number of SNPs are included in the credible set, but all of them fall below the GWAS significance line. See https://statgen.github.io/locuszoom/examples/credible_sets.html?chrom=2&start=161046447&end=161646447