chr1swallace / coloc

Repo for the R package coloc

Error when using coloc.signals #42

Closed nbbarrientos closed 3 years ago

nbbarrientos commented 3 years ago

Hello,

First, I wanted to thank you for this great tool. Second, I wanted to run coloc.signals but I am encountering what seems to be a memory error and some warnings regarding sdY.est (even when I provide varbeta, MAF, and N):

Error: vector memory exhausted (limit reached?)
In addition: Warning messages:
1: In sdY.est(vbeta = varbeta, maf = MAF, n = N) :
  estimating sdY from maf and varbeta, please directly supply sdY if known
2: In sdY.est(d$varbeta, d$MAF, d$N) :
  estimating sdY from maf and varbeta, please directly supply sdY if known

I have tried running it locally in RStudio and on a cluster, and it fails no matter how much memory I provide. I know coloc.signals is part of the new coloc-4.0-4 version, so I was wondering if this might be a bug. Thank you for your help.

Nelson Barrientos

chr1swallace commented 3 years ago

Hi Nelson,

It would help me understand how this arose if you could show me how you called coloc, along with the output of summary() run on each object you pass. Is that possible?

C

nbbarrientos commented 3 years ago

Hi Chris,

Thank you for your response. This is how I called coloc:

p = coloc.signals(
  dataset1 = list(pvalues = y$p, snp = y$SNP, N = 296525, type = "cc", s = 0.12,
                  MAF = y$freq, beta = y$b, varbeta = y$varbeta),
  dataset2 = list(pvalues = x$pval, snp = x$SNP, type = "quant", MAF = x$MAF,
                  N = 149, beta = x$b, varbeta = x$varbeta),
  method = c("single"), mode = c("iterative"),
  p1 = 1e-04, p2 = 1e-04, p12 = 1e-05, maxhits = 3)

I'm unable to provide the output from summary() because I think it fails before it gets there. I was also wondering if an alternative would be to run coloc.abf since I'm using method = c("single")?

Thank you

chr1swallace commented 3 years ago

I mean, can you show me the output of summary(y) and summary(x), please?

nbbarrientos commented 3 years ago

Oh, my bad. These are the summaries for both:

summary(x): [image]

summary(y): [image]

chr1swallace commented 3 years ago

I suspect R is getting upset when it tries to fit a regression to 8 million observations to estimate the variance of your quantitative trait. It seems you are supplying the full GWAS summary statistics to coloc. It is not designed for that, but for investigating whether two traits share a causal variant in a small, LD-defined region. How small? I find that with dense genotyping such regions typically contain 1,000 to 10,000 SNPs. See the FAQ: https://github.com/chr1swallace/coloc/blob/master/FAQ.md#can-the-process-of-identifying-colocalized-variants-be-carried-out-genome-wide-or-is-it-meant-to-be-done-in-defined-small-regions
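As a rough illustration of restricting both datasets to one region before calling coloc, here is a minimal base R sketch. The column names chr and pos, the helper subset_region, the region boundaries, and the toy data are all assumptions made for illustration; the thread does not show which position columns x and y actually contain.

```r
# Hypothetical helper: keep only the rows of a summary-statistics table that
# fall inside one genomic region. The column names chr and pos are assumed.
subset_region <- function(df, chr, start, end) {
  df[df$chr == chr & df$pos >= start & df$pos <= end, , drop = FALSE]
}

# Toy tables standing in for the full GWAS summary statistics x and y.
set.seed(1)
x <- data.frame(SNP = paste0("rs", 1:1000),
                chr = rep(1:2, each = 500),
                pos = rep(seq(1e5, by = 1e4, length.out = 500), 2),
                stringsAsFactors = FALSE)
y <- x[sample(nrow(x), 800), ]

# Cut both tables down to the same (made-up) region on chromosome 1.
x_reg <- subset_region(x, chr = 1, start = 1e6, end = 2e6)
y_reg <- subset_region(y, chr = 1, start = 1e6, end = 2e6)

# Keep only SNPs present in both tables, in the same order.
shared <- intersect(x_reg$SNP, y_reg$SNP)
x_reg <- x_reg[match(shared, x_reg$SNP), ]
y_reg <- y_reg[match(shared, y_reg$SNP), ]

# The regional tables would then take the place of x and y in the
# coloc.signals() call shown earlier in the thread, for example
#   snp = x_reg$SNP, beta = x_reg$b, varbeta = x_reg$varbeta, ...
```

With real data, it would be worth checking that the number of rows left in each regional table is in the range mentioned above before passing them on.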

nbbarrientos commented 3 years ago

Interesting. I think I have an idea for addressing this: I'll try breaking the analysis into smaller pieces. Thank you very much for your help.

Nelson
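
One way "breaking the analysis into smaller pieces" might look is sketched below in base R. It assumes the pieces are fixed-width windows around a handful of lead SNPs, which is a simplification and not the LD-defined regions recommended above. The helper make_windows, the flank size, the lead positions, and the chr/pos column names are hypothetical.

```r
# Hypothetical helper: build +/- flank windows around chosen lead SNPs.
# Fixed-width windows are an assumption; LD-defined regions may differ.
make_windows <- function(lead, flank = 5e5) {
  data.frame(chr = lead$chr,
             start = pmax(lead$pos - flank, 0),
             end = lead$pos + flank)
}

# Made-up lead SNP positions and a toy summary-statistics table.
lead <- data.frame(chr = c(1, 2), pos = c(1.5e6, 3e5))
regions <- make_windows(lead)
stats <- data.frame(SNP = paste0("rs", 1:600),
                    chr = rep(1:2, each = 300),
                    pos = rep(seq(1e4, by = 1e4, length.out = 300), 2),
                    stringsAsFactors = FALSE)

# Split the table into one piece per window.
pieces <- lapply(seq_len(nrow(regions)), function(i) {
  r <- regions[i, ]
  stats[stats$chr == r$chr & stats$pos >= r$start & stats$pos <= r$end, ,
        drop = FALSE]
})

# Number of SNPs per piece; each piece would be analysed in its own coloc call.
n_snps <- vapply(pieces, nrow, integer(1))
print(n_snps)
```

Looping over the pieces keeps each call to coloc at a regional scale, so the sdY estimation step only ever sees the SNPs in one window.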