privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/

Memory size issues in snp_cor() #439

Closed: kkleinoros closed this issue 10 months ago

kkleinoros commented 10 months ago

I am trying to run LDpred2 with a sample size of 7304 and 1422515 SNPs.
I am calling snp_cor() with ncores = 12.
Although I am running snp_cor() on one chromosome at a time, I am still running out of memory. Any solutions?

privefl commented 10 months ago

So, the issue is with snp_cor(), not LDpred2, right?

How many variants on chromosome 1? Did you try chromosome 22? Which positions are you using? And what window size?

kkleinoros commented 10 months ago

Correct. 115717 variants on chromosome 1. Window size = 3 / 1000, as in the tutorial. chr22 does work.

privefl commented 10 months ago

How much time for chr22? How much memory for chr1?

privefl commented 10 months ago

Please verify the vector of positions you're giving to that function.

kkleinoros commented 10 months ago

chr22, 20621 SNPs:

system.time(corr0 <- snp_cor(G, ind.col = ind.chr2, size = 3 / 1000, infos.pos = POS2[ind.chr2], ncores = 12))

    user   system  elapsed
2819.791    3.573  260.073

privefl commented 10 months ago

Did you check POS2?

kkleinoros commented 10 months ago

POS2 is 0 for all elements.
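A check along the following lines would flag this kind of problem before calling snp_cor(). This is only a sketch in base R: `check_positions()` is a hypothetical helper, not part of bigsnpr, and the particular checks it runs are assumptions about what a usable position vector should look like.

```r
# Hypothetical helper (not a bigsnpr function): basic sanity checks on a
# position vector before passing it as `infos.pos`.
# Returns a character vector of problems; empty if none were found.
check_positions <- function(pos) {
  if (!is.numeric(pos)) return("not numeric")
  problems <- character(0)
  if (anyNA(pos)) {
    problems <- c(problems, "contains NA")
  }
  # A vector with a single distinct value (e.g. all zeros) carries no
  # distance information at all.
  if (length(pos) > 1 && length(unique(stats::na.omit(pos))) == 1L) {
    problems <- c(problems, "all values identical")
  }
  # Positions within one chromosome are assumed to be non-decreasing.
  if (is.unsorted(pos, na.rm = TRUE)) {
    problems <- c(problems, "not sorted")
  }
  problems
}

pos_bad <- rep(0, 5)            # what POS2 looked like in this thread
pos_ok  <- c(0, 0.5, 1.2, 3.4)  # made-up genetic positions in cM

check_positions(pos_bad)  # "all values identical"
check_positions(pos_ok)   # character(0)
```

Running `summary(POS2)` or `range(POS2)` by hand gives the same information; the helper only makes the check repeatable per chromosome.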

kkleinoros commented 10 months ago

I went back to your tutorial and I see that I missed this step:

To convert physical positions (in bp) to genetic positions (in cM), use

POS2 <- snp_asGeneticPos(CHR, POS, dir = "tmp-data", ncores = NCORES)

privefl commented 10 months ago

Yeah, this explains it.
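For context on why all-zero positions exhaust memory: when `infos.pos` is supplied, snp_cor() selects pairs of variants by the distance between their positions rather than by a count of neighbouring variants, so if every position is 0, every pair on the chromosome falls inside the window. The sketch below illustrates this with base R only. `count_pairs_in_window()` is a hypothetical helper, not a bigsnpr function, and the figure of 8 bytes per stored value is an assumption used for a rough lower bound, not a description of how snp_cor() stores its result.

```r
# Hypothetical helper (not a bigsnpr function): count pairs (i < j) whose
# positions differ by at most `window`, assuming `pos` is sorted.
count_pairs_in_window <- function(pos, window) {
  stopifnot(is.numeric(pos), !anyNA(pos), !is.unsorted(pos))
  # For each variant i, index of the last variant with pos <= pos[i] + window
  last_in_window <- findInterval(pos + window, pos)
  sum(as.numeric(last_in_window) - seq_along(pos))
}

# Toy example: 1000 variants, window of 3 (same units as `pos`)
n <- 1000
pairs_zero   <- count_pairs_in_window(rep(0, n), window = 3)  # every pair
pairs_spread <- count_pairs_in_window(seq(0, 200, length.out = n), window = 3)

# Worst case for the chromosome 1 size reported in this thread
m <- 115717
all_pairs <- m * (m - 1) / 2      # number of distinct pairs
approx_gb <- all_pairs * 8 / 1e9  # assuming 8 bytes per correlation value
```

With identical positions the pair count is n(n-1)/2, so it grows quadratically with the number of variants. For 115717 variants that is about 6.7 billion pairs, or more than 50 GB at 8 bytes each, whereas the same calculation for the 20621 variants on chr22 gives under 2 GB, which is consistent with chr22 finishing and chromosome 1 running out of memory.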

So, this is working now?

kkleinoros commented 10 months ago

Yes, thank you.