xscapex closed this issue 3 years ago
I think other people may have had the same problem and reported it in other issues. Two things come to mind: "do you have enough disk space where you want to write?" (this should be checked in the latest version of {bigstatsr}) and "are these the BGEN files that UKBB provides, or some that you made yourself?". And yes, it is normal that you get the .bk file, because it is the first thing that is produced (but it is probably not filled with any data yet), while the .rds file is the last thing that is produced.
The BGI files are used to find where the variants are stored in the BGEN files, so we don't really care about the order here. This is also why this function needs SNP IDs instead of just position indices (as for the individuals).
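Because the .bk file is created first and the .rds file last, a read that crashed partway leaves a .bk file without a matching .rds file. Here is a minimal base-R sketch for spotting that state before retrying. `has_incomplete_backing()` is a hypothetical helper name, not part of {bigsnpr} or {bigstatsr}, and "test2" is only a placeholder prefix.

```r
# Hypothetical helper (not part of bigsnpr or bigstatsr).
# TRUE when "<prefix>.bk" exists but "<prefix>.rds" does not,
# i.e. the state left behind by a read that did not finish.
has_incomplete_backing <- function(prefix) {
  file.exists(paste0(prefix, ".bk")) && !file.exists(paste0(prefix, ".rds"))
}

# "test2" is a placeholder backing-file prefix
if (has_incomplete_backing("test2")) {
  message("Found test2.bk without test2.rds; the previous read did not finish.")
}
```

If a leftover .bk file is found, removing it (as done with file.remove() elsewhere in this thread) before calling snp_readBGEN() again avoids reusing a half-written file.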
Hi Florian,
Thank you so much for the prompt reply, I’ll try it and update here!
I have checked my disk space, and there seems to be enough to write the files.
`setwd("E:\\readBGEN_test1")
library(data.table)
library(bigsnpr)`
`###############################################################
###############################################################
write("TMPDIR = 'E:\\Rtmp'", file = file.path(Sys.getenv("R_USER"), ".Renviron"))
tempdir() # There is 9 TB of free space
FBM(500000, 1261158, backingfile = "E:\\readBGEN_test1\\test0") # Needs 4.58 TB of space, enough!
FBM(10417, 1261158, backingfile = "E:\\readBGEN_test1\\test1") # Needs 98 GB of space, enough!
file.remove("test0.bk")
file.remove("test1.bk")`
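The space estimates in the comments above follow from rows × columns × bytes per element. Here is a small sketch of that arithmetic, assuming 8 bytes per element (a default FBM() stores doubles); `fbm_size_bytes()` is a hypothetical helper, not a {bigstatsr} function.

```r
# Hypothetical helper: expected backing-file size in bytes.
# as.numeric() avoids integer overflow for large dimensions.
fbm_size_bytes <- function(n_row, n_col, bytes_per_element = 8) {
  as.numeric(n_row) * as.numeric(n_col) * bytes_per_element
}

size_full   <- fbm_size_bytes(500000, 1261158)  # all individuals, doubles
size_subset <- fbm_size_bytes(10417, 1261158)   # 10,417 individuals, doubles

round(size_full / 1024^4, 2)    # size in TiB
round(size_subset / 1024^3, 1)  # size in GiB
```

These come out at roughly 4.6 TiB and 98 GiB, in line with the figures quoted in the comments. A genotype matrix stored with one byte per element would need about an eighth of that, which can be checked with `bytes_per_element = 1`.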
bigsnpr requires the Rcpp package, so I installed it and made sure that Rtools works.
`###############################################################
###############################################################
Rcpp::evalCpp("2+2") # this outputs 4`
snp_readBGEN() requires list_snp_id, and I made sure it has the required form.
`################################################################
################################################################
list_snp_id <- fread("snpid_chr21.txt", header = FALSE)
list_snp_id <- as.list(list_snp_id)`
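A quick sanity check on the structure can help before calling snp_readBGEN(). The sketch below assumes that list_snp_id is a list with one character vector per BGEN file and that IDs follow the chromosome_position_allele1_allele2 pattern seen in this thread. `check_snp_ids()` is a hypothetical helper, and the regular expression is only an assumption about what a well-formed ID looks like (numeric chromosomes or X, Y, MT, and alleles made of A, C, G, T). "rs123" is a made-up ID included to show a mismatch.

```r
# Hypothetical helper (not part of bigsnpr).
# Returns, for each BGEN file, the IDs that do NOT match the assumed pattern.
check_snp_ids <- function(list_snp_id, n_bgen_files = 1) {
  stopifnot(is.list(list_snp_id), length(list_snp_id) == n_bgen_files)
  pattern <- "^([0-9]+|X|Y|MT)_[0-9]+_[ACGT]+_[ACGT]+$"
  lapply(list_snp_id, function(ids) {
    ids <- as.character(ids)
    ids[!grepl(pattern, ids)]
  })
}

ids <- list(c("21_9411239_G_A", "21_19467441_CAA_C", "rs123"))
check_snp_ids(ids)  # only the made-up "rs123" is flagged
```

Note that an ID with a multi-base allele, such as 21_19467441_CAA_C, passes this check, since the pattern allows alleles longer than one base (insertions and deletions).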
This is what my list_snp_id looks like:
I first tried to read the BGEN file provided by UK Biobank, but I got the session error.
`###############################################################
###############################################################
rds <- snp_readBGEN(bgenfiles = "ukb_chr21_v3.bgen", backingfile = "test2", ncores = nb_cores(), list_snp_id = list_snp_id)`
Then I created a subset file containing 10,417 cases, but I got the same error.
`################################################################
################################################################
cmd <- paste0("plink2 --bgen ukb_chr21_v3.bgen ref-first --sample ukb22828_c21_b0_v3_s487268.sample --keep positive_list_10417.txt --export bgen-1.2 --out ukb_chr21_v3_10417")
system(cmd)
rds <- snp_readBGEN(bgenfiles = "ukb_chr21_v3_10417.bgen", backingfile = "test3", ncores = nb_cores(), list_snp_id = list_snp_id)`
I found that some of the IDs in list_snp_id looked strange, such as 21_19467441_CAA_C. So I used only one normal-looking ID to read the BGEN file, but I still got the error:
`###############################################################
###############################################################
snp1 <- list_snp_id[[1]][1] # snp1 = "21_9411239_G_A"
snp1 <- as.list(snp1)
rds <- snp_readBGEN(bgenfiles = "ukb_chr21_v3_10417.bgen", backingfile = "test4", ncores = nb_cores(), list_snp_id = snp1)`
I have no idea how to fix the problem. Was my list_snp_id wrong?
Thanks, Monica
Can you try specifying a subset of individuals?
And also with ncores = 1.
Are you using some specific environment, like conda?
`rds <- snp_readBGEN(bgenfiles = "ukb_chr21_v3_10417.bgen", backingfile = "F:\\readBGEN\\test4", ncores = 1, list_snp_id = snp1, ind_row = 1)`
Then, last thing, can you try with a relative path, e.g. "test4" instead of "F:\\readBGEN\\test4"?
`rds <- snp_readBGEN(bgenfiles = "ukb_chr21_v3_10417.bgen", backingfile = "test4", ncores = 1, list_snp_id = snp1, ind_row = 1)`
I still got the session error. Maybe I should use an R 3.x.x version?
You can try, but the version that you're using is the CRAN version, right? This is checked on all kinds of architectures and the newest versions of R.
Could it be due to a corrupted file like in https://github.com/privefl/bigsnpr/issues/212?
Any update on this?
Hi Florian,
Thank you so much for the reply, I’ll try it next week and update here!
Any update on this?
Hi Florian,
Sorry for the late reply. I have checked the .bgen files, and the MD5 checksums are the same as the ones from UKBB (attached file). I still can't figure out why snp_readBGEN() could not generate the .rds file. Maybe I should remove the missing value before I use snp_readBGEN().
Can you remind me of the exact issue you have?
What do you mean exactly by "Maybe I should remove the missing value before I use snp_readBGEN()."?
Looking at the md5sums, it seems the one for chromosome 5 is different; it should be 45f95365b17d4530a42faf95a70deddd.
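For reference, checksums like these can be compared directly in R with tools::md5sum(), which ships with base R. The sketch below is only illustrative: `check_md5()` is a hypothetical helper, and the demo runs on a small temporary file, not on a real BGEN file.

```r
# Hypothetical helper: compare MD5 checksums of local files with expected values.
# `expected` is a character vector of checksums, in the same order as `files`.
check_md5 <- function(files, expected) {
  observed <- unname(tools::md5sum(files))
  data.frame(file = basename(files), observed = observed,
             ok = observed == tolower(expected))
}

# Self-contained demo on a temporary file containing the bytes "hello\n"
f <- tempfile(fileext = ".txt")
writeBin(charToRaw("hello\n"), f)
check_md5(f, "b1946ac92492d2347c6235b8d2611184")
```

To use it on the real data, one would pass the paths of the downloaded BGEN files and the corresponding checksums published by UKBB; any row with `ok` equal to FALSE points to a file worth downloading again.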
Hi Florian,
Thank you so much for creating such a convenient package. I have two questions about the snp_readBGEN() function, and I'm wondering if you could give me some advice.
Thank you, and I look forward to hearing from you. Kind regards, Monica