zhengxwen / SeqArray

Data management of large-scale whole-genome sequence variant calls (Development version only)
http://www.bioconductor.org/packages/SeqArray

Bug: Stream Error with seqSetFilterAnnotID (I suppose) #77

Closed iaia87 closed 2 years ago

iaia87 commented 2 years ago

Hello,

I have many .gds files. I got a stream error today and I am completely stuck!

I suppose that the problem comes from my setFilter call, but I am not sure. Any idea how to resolve it?

Do you have a better idea for obtaining a GDS file filtered on variants without creating a new VCF?

Here is my code:

names <- 0
for (i in 1:22) {
  names[i] <- paste0("chr", i)
}
names

gds_list_GWAS1_EUR <- list()
out_GWAS1_EUR.fn <- list()

for (i in 1:length(names)) {
  gds_list_GWAS1_EUR[[i]] <- c(
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_1million_hg19.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_GAIN_hg19.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_HAP610_hg19.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_HAP660_part1_hg19.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_Omni2.5_EUR.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_Omni_express_EUR.gds"))
  out_GWAS1_EUR.fn[[i]] <- paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr", i, "-GWAS1_EUR_def.gds")
  names(gds_list_GWAS1_EUR)[[i]] <- names[i]
  names(out_GWAS1_EUR.fn)[[i]] <- names[i]
}
out_vcf_filtred.fn <- list()

for (i in 1:length(names)) {
  out_vcf_filtred.fn[[i]] <- c(
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/gds_file/chr",i,"-GWAS1_1million_hg19.gds"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/vcf/filtred-chr",i,"-GWAS1_GAIN_hg19.vcf.gz"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/vcf/filtred-chr",i,"-GWAS1_HAP610_hg19.vcf.gz"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/vcf/filtred-chr",i,"-GWAS1_HAP660_part1_hg19.vcf.gz"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/vcf/filtred-chr",i,"-GWAS1_Omni2.5_EUR.vcf.gz"),
    paste0("/media/lauraNFS3/Laura_ConLiGen/POST_IMPUTE_DATA_V2/vcf/filtred-chr",i,"-GWAS1_Omni_express_EUR.vcf.gz"))
  names(out_vcf_filtred.fn)[[i]] <- names[i]
}

id <- mcols(Conligen_granges$chr22)$SNPs_ID

genofile_chr22 <- list()
for (i in 1:6) {
  genofile_chr22[[i]] <- seqOpen(gds_list_GWAS1_EUR[[22]][i])
}

Until now, it's OK!

for (i in 1:6) {
  seqSetFilterAnnotID(genofile_chr22[[i]], id, ret.idx=FALSE, verbose=TRUE)
}  # no problem

for (i in 1:6) {
  seqGDS2VCF(genofile_chr22[[i]], out_vcf_filtred.fn[[22]][i],
             info.var=NULL, fmt.var=NULL, chr_prefix="",
             use_Rsamtools=TRUE, verbose=TRUE)
}

First Stream Error

I quit, and now when I run the seqOpen command I get: Stream Read Error, need 12 byte(s) but receive 0

Sorry if this is a bit long... I hope that you can help me!

Thanks a lot, Laura

iaia87 commented 2 years ago

I created my GDS files again and that solved my problem. I think that while filtering I did not close some files, and this caused my problem.

I have another question: I want to merge GDS files after filtering by variants, but I have had to create new VCF files, because I have not found how to merge filtered files with SeqArray, and SeqVarTools does not offer such an option.

If you can help on this subject, I will be very grateful!

Best regards, Laura
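
On the unclosed-file explanation above, one way to avoid leaving GDS handles open is to wrap each open/close pair in a small function. The sketch below is only an illustration: `with_gds` is a hypothetical helper, not part of SeqArray, and the example file shipped with the package stands in for the real per-chromosome files. It may also be worth checking that no output path coincides with an input path: in the listing above, the first element of out_vcf_filtred.fn is the same .gds path as the first input file, which could be another source of the stream error.

```r
library(SeqArray)

# Hypothetical helper (not part of SeqArray): open a GDS file, run `fun` on it,
# and always close the handle, even if `fun` stops with an error.
with_gds <- function(path, fun) {
  gds <- seqOpen(path)                  # opened read-only by default
  on.exit(seqClose(gds), add = TRUE)    # runs when with_gds() exits
  fun(gds)
}

# Demonstrated on the example GDS file shipped with SeqArray.
example_fn <- seqExampleFileName("gds")
n_variants <- with_gds(example_fn, function(gds) {
  length(seqGetData(gds, "variant.id"))
})
n_variants
```

Because the close is registered with on.exit(), an error inside the callback should not leave the file open for the rest of the session.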

zhengxwen commented 2 years ago

You have to create the filtered GDS files, before merging.

iaia87 commented 2 years ago

Hi,

Thanks a lot for your help! To create a filtered GDS file, I create a filtered VCF and then transform it into a filtered GDS. Do you know if I can do this another way?

I have just begun to use your package and I have not found another strategy that does not lose the imputation information.

Best regards, Laura

On Fri, 3 June 2022 at 00:14, Xiuwen Zheng @.***> wrote:

You have to create the filtered GDS files, before merging.


zhengxwen commented 2 years ago

You can use seqExport() to export a filtered GDS file to a new GDS file, which is used in merging later. seqSetFilter() can be used to set a filter on the GDS file.
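
A minimal sketch of that filter-then-export step, under stated assumptions: the example GDS file shipped with SeqArray stands in for one per-chromosome input, `out_fn` is a hypothetical temporary path, and `id` is built from the file's own annotation IDs rather than from the poster's GRanges object. It uses seqSetFilterAnnotID() from the original post to set the variant filter.

```r
library(SeqArray)

# Example file shipped with SeqArray, standing in for one input GDS file.
in_fn  <- seqExampleFileName("gds")
out_fn <- tempfile(fileext = ".gds")   # hypothetical output path

gds <- seqOpen(in_fn)

# Stand-in for the poster's `id` vector: the first 50 non-empty annotation IDs.
all_ids <- seqGetData(gds, "annotation/id")
id <- head(all_ids[nzchar(all_ids)], 50)

# Keep only the variants whose annotation ID is in `id`.
seqSetFilterAnnotID(gds, id, verbose = FALSE)

# Write the currently selected samples and variants to a new GDS file.
# With info.var and fmt.var left at their defaults, all INFO and FORMAT
# variables are exported.
seqExport(gds, out_fn, verbose = FALSE)
seqClose(gds)

# Reopen the exported file to count the variants it contains.
filtered <- seqOpen(out_fn)
n_kept <- length(seqGetData(filtered, "variant.id"))
seqClose(filtered)
n_kept
```

Repeating this for each input file gives a set of filtered GDS files, which could then be passed as a character vector to seqMerge(gds.fn, out.fn) for the merging step.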

iaia87 commented 2 years ago

Thanks a lot for your help. I don't know why I thought that this command created a GDS file for SNPRelate and not for SeqArray...

Thanks a lot !

Laura Lombardi

On Fri, 3 June 2022 at 23:33, Xiuwen Zheng @.***> wrote:

You can use seqExport() to export a filtered GDS file to a new GDS file, which is used in merging later.
