caleblareau / gchromVAR

Cell type specific enrichments using finemapped variants and quantitative epigenetic data
https://caleblareau.github.io/gchromVAR/
MIT License
42 stars, 9 forks

Which GWAS SNPs should I use? #11

Open smorabit opened 4 years ago

smorabit commented 4 years ago

I ran through the vignette (https://caleblareau.github.io/gchromVAR/articles/gchromVAR_vignette.html) using my scATAC-seq data for the peaks, and I downloaded fine-mapped posterior probabilities for a few traits of interest from CausalDB (http://mulinlab.org/causaldb/index.html). However, the results look strange: the Z-scores and deviations are extremely high for a few cells and around zero for the remaining cells.

Am I supposed to have posterior probabilities for every single SNP included in the GWAS, or only for the loci that reached genome-wide significance and were followed up with Bayesian fine-mapping? For instance, the GWAS for Alzheimer's Disease (Jansen et al 2019) profiled over 1 million SNPs, but CausalDB only has fine-mapping posterior probabilities for ~10k SNPs.

caleblareau commented 4 years ago

Hi, this is a good question. Can you report back the number of non-zero peaks with PP signal?

The tool doesn’t expect every SNP to be scored (any good fine-mapping will report mostly 0s anyway). My guess is that the signal is still sparse, though, which leads to these extreme values; they may also be a function of your peak set.
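One way to check the sparsity mentioned above is to count, per trait, how many peaks carry any posterior-probability (PP) signal at all. The snippet below is a minimal sketch in Python with NumPy, not gchromVAR code (gchromVAR itself is an R package). It assumes you have already exported a peaks-by-traits matrix of summed PPs; the function name `count_nonzero_peaks`, the trait names, and the toy values are all hypothetical.

```python
import numpy as np

def count_nonzero_peaks(pp_by_peak, trait_names):
    """Return {trait: number of peaks whose summed PP is greater than zero}.

    pp_by_peak: array of shape (n_peaks, n_traits), one column per trait.
    (Hypothetical helper for illustration; not part of gchromVAR.)
    """
    pp = np.asarray(pp_by_peak, dtype=float)
    if pp.ndim != 2 or pp.shape[1] != len(trait_names):
        raise ValueError("expected a peaks x traits matrix matching trait_names")
    # Boolean mask of peaks with signal, summed down each trait column.
    counts = (pp > 0).sum(axis=0)
    return {name: int(n) for name, n in zip(trait_names, counts)}

# Toy data: 6 peaks, 2 made-up traits; most entries are zero, as expected
# for fine-mapped PPs overlapped with a peak set.
pp_by_peak = np.array([
    [0.00, 0.00],
    [0.85, 0.00],
    [0.00, 0.00],
    [0.10, 0.02],
    [0.00, 0.00],
    [0.00, 0.40],
])
nonzero = count_nonzero_peaks(pp_by_peak, ["trait_A", "trait_B"])
print(nonzero)
```

If only a handful of peaks per trait are non-zero, a few cells that happen to have reads in those peaks can dominate the scores.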

For droplet-based snATAC, we’ve found that clustering and then scoring rather than scoring individual cells has been most stable, so I’d generally recommend that where possible.
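The "cluster first, then score" idea amounts to pooling per-cell counts into one pseudobulk column per cluster before computing enrichments. Here is a rough sketch of that pooling step in Python with NumPy, assuming a peaks-by-cells count matrix and one cluster label per cell from whatever clustering you already ran. The helper `pseudobulk_by_cluster` and the toy data are hypothetical; this is not gchromVAR's own code and does not perform the scoring itself.

```python
import numpy as np

def pseudobulk_by_cluster(counts, cluster_labels):
    """Sum per-cell counts within each cluster.

    counts: array of shape (n_peaks, n_cells).
    cluster_labels: length n_cells, one label per cell.
    Returns (clusters, pooled) where pooled has shape (n_peaks, n_clusters)
    and columns follow the sorted order of the unique labels.
    (Hypothetical helper for illustration; not part of gchromVAR.)
    """
    counts = np.asarray(counts)
    labels = np.asarray(cluster_labels)
    if counts.ndim != 2 or counts.shape[1] != labels.shape[0]:
        raise ValueError("need one cluster label per column (cell) of counts")
    clusters = np.unique(labels)
    # For each cluster, select its cells and sum their counts per peak.
    pooled = np.stack(
        [counts[:, labels == c].sum(axis=1) for c in clusters], axis=1
    )
    return clusters, pooled

# Toy data: 3 peaks x 5 cells, two made-up clusters.
counts = np.array([
    [1, 0, 0, 2, 0],
    [0, 0, 1, 0, 0],
    [0, 3, 0, 0, 1],
])
labels = ["c1", "c2", "c1", "c2", "c2"]
clusters, pooled = pseudobulk_by_cluster(counts, labels)
print(clusters)
print(pooled)
```

The pooled matrix is much less sparse than the per-cell one, which is why scoring clusters tends to be more stable than scoring individual cells.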
