Closed JasperHof closed 9 months ago
So the error is with readBGI? Are you sure you need to provide a list there?
Hi Florian, thanks for the quick reply.
Ah, you are right that I should input a vector when using readBGI. However, that also did not work, using the same error. E.g., I tried:
bgi = snp_readBGI(bgifile, snp_id = "21_11098615_G_T")
Which again gave me the error "Error in format_snp_id(info_id) : Wrong format of some variants.". I get the same error when using snp_readBGEN.. so not sure what's going wrong
Thanks again,
Jasper
Okay, it seems the issue is actually with the SNP IDs derived from the BGI data.
Can you look at infos$myid, and maybe try bigsnpr:::format_snp_id on each of these until you find the one causing the issue.
Hi Florian,
Thanks again for the reply. I have run bigsnpr:::format_snp_id(infos$myid), but this returns all SNP ids (does not stop). For now I am just focusing on chr 21/22, so the first two characters are digits.
So... this is quite weird. When I am running this code, the first part works, but the final line does not:
snp_id = c("21_11098615_G_T", "21_14402660_C_T", "21_14444176_C_A")
function (snp_id) { snp_id <- ifelse(substr(snpid, 2, 2) == "", paste0("0", snp_id), snp_id) if (any(substr(snpid, 3, 3) != "")) stop("Wrong format of some variants.") snp_id }
bgi = snp_readBGI(bgifile, snp_id = snp_id) # Gives error: "Error in format_snp_id(info_id) : Wrong format of some variants"
So now, I believe there is a different cause to the problem. When loading the 'infos' variable using
infos <- dplyr::collect(dplyr::tbl(db_con, "Variant"))
Here, 'infos' does not contain my chromosome information, which is needed to match the SNPs. I entered this manually at first using 'infos$chromosome = chromosome', but I believe this has led to an error within the 'snp_readBGI' function. So, I will check how can I can get the ,bgi files to include the chromosome number :-).
Cheers,
Jasper
So, to make sure I understand correctly, the issue is that the BGI file does not provide any chromosome information?
Yes, I was unable to run readBGI/readBGEN, because my genotype/bgi files did not contain information on the chromosome number. So now, i included '-assume-chromosome' in qctool, and now the chromosome number is included. The readBGI/readBGEN functions worked for me after doing this.
Thanks again for the quick replies, maybe this can help a future user. Best,
Jasper
Hi Florian and users,
I encountered an error which I believe has been reported before, however I was not able to find the solution here on GitHub, or in the tutorials you provided. I created bgen files using qctool (8bits, using -ofiletype bgen_v1.2 -bgen-bits 8). Then, i take the following steps to try to read the .bgen/.bgi files in R:
################################
bgenfile = base filename
bgifile = paste0(bgenfile, ".bgi")
db_con <- RSQLite::dbConnect(RSQLite::SQLite(), bgifile) on.exit(RSQLite::dbDisconnect(db_con), add = TRUE) infos <- dplyr::collect(dplyr::tbl(db_con, "Variant"))
infos$chromosome = chromosome infos$myid <- with(infos, paste(chromosome, position, allele1 allele2, sep = "_"))
For example, infos$myid[1] = "21_11098615_T_G"
snps = list() snps[[1]] = infos$myid[indices]
bgi = snp_readBGI(bgifile, snp_id = snps) #################################
This gives me the error: "Error in format_snp_id(info_id) : Wrong format of some variants.". This error remains when I just try to read a single SNP (for checking). All my SNPs are biallelic, and I've also tried switching the order of alleles in infos$myid. I have re-installed my rlang, bigsnpr, and tidyr package, but nothing seems to work.
Do you have an idea what might cause this problem? Thanks in advance!
Best,
Jasper