zhanxw / seqminer

Query sequence data (VCF/BCF1/BCF2, Tabix, BGEN, PLINK) in R
http://zhanxw.github.io/seqminer/

seqminer memory requirement #16

Open garyzhubc opened 3 years ago

garyzhubc commented 3 years ago

I have a dataset with 487409 samples and 571622 SNPs. Is it possible to load this entire dataset with seqminer? I tried requesting 2000 GB of memory on Compute Canada, but it didn't work, even though 487409 × 571622 × 4 bytes per float is 1,114.45483 gigabytes. Theoretically this should fit, so seqminer seems to be using a lot more memory than it should. I found that readBGENToMatrixByRange is more memory-efficient than readBGENToListByRange, but readBGENToMatrixByRange still fails on the full dataset.
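
For reference, the back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic for a dense samples × SNPs matrix; the helper name `dense_matrix_bytes` is made up for illustration and is not part of seqminer. The 8-byte case is included because R stores numeric values as 8-byte doubles; whether seqminer actually materialises the whole dataset as one dense double matrix is an assumption here, not something confirmed in this thread.

```python
# Rough memory estimate for a dense genotype matrix (samples x SNPs).
# `dense_matrix_bytes` is a hypothetical helper, not a seqminer function.

def dense_matrix_bytes(n_samples: int, n_snps: int, bytes_per_value: int) -> int:
    """Bytes needed to hold one value per (sample, SNP) pair, ignoring overhead."""
    return n_samples * n_snps * bytes_per_value

N_SAMPLES = 487_409   # from the issue text
N_SNPS = 571_622      # from the issue text

# 4 bytes per value: the figure used in the issue text.
float32_bytes = dense_matrix_bytes(N_SAMPLES, N_SNPS, 4)

# 8 bytes per value: the size of an R numeric (double).
# Assumption: the data ends up in a dense double matrix.
float64_bytes = dense_matrix_bytes(N_SAMPLES, N_SNPS, 8)

print(f"4-byte floats:  {float32_bytes / 1e9:,.2f} GB")
print(f"8-byte doubles: {float64_bytes / 1e9:,.2f} GB")
```

Under that assumption, the 8-byte figure alone is larger than the 2000 GB that was requested, before counting any temporary copies made while reading.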