Closed ryshi06 closed 11 months ago
I can't diagnose this problem without a reproducible example.
Can you please also supply the code you used that produced the error, and the output of sessionInfo()?
Email attachments don't seem to work when replying to GitHub issues. Please go to the issue page on the GitHub website and paste your code into the comment box.
Generate SnpAnnot:
ref <- read.table("GDA_A1_snps.txt", sep="\t", header = FALSE)
colnames(ref) <- c("snpName", "chromosome", "position", "perc_match", "strand", "TOP")
d1 <- subset(ref, select=c("snpName", "chromosome", "position"))
d1$chromosome[d1$chromosome=="X"] <- 23
d1$chromosome[d1$chromosome=="Y"] <- 25
d1$chromosome[d1$chromosome=="MT"] <- 26
d1$chromosome[d1$chromosome=="0"] <- 27
d1$chromosome[d1$chromosome=="XY"] <- 24
d1$chromosome <- as.integer(d1$chromosome)
d <- d1[order(d1$chromosome, d1$position), ]
d$snpID <- 1:nrow(d)
d <- d[,c("snpID", "snpName", "chromosome", "position")]
snpAnnot <- SnpAnnotationDataFrame(d)
meta <- varMetadata(snpAnnot)
meta[c("snpID", "snpName", "chromosome", "position"), "labelDescription"] <-
  c("unique integer ID for SNPs (row number assigned)",
    "BeadSet SNP ID from Illumina",
    paste("integer code for chromosome: 1:22=autosomes,",
          "23=X, 24=pseudoautosomal, 25=Y, 26=Mitochondrial, 27=Unknown"),
    "base pair position on chromosome (build 37)")
varMetadata(snpAnnot) <- meta
Generate ScanAnnot:
d <- read.table("scanAnnot_fake.txt", sep = "\t", header = TRUE)
scanAnnot <- ScanAnnotationDataFrame(d)
meta <- varMetadata(scanAnnot)
meta[c("scanID","scanName","file","sex","race"), "labelDescription"] <-
  c("unique ID for scans", "subject identifier", "raw data file", "Sex", "Race")
varMetadata(scanAnnot) <- meta
Create gds file:
path <- "."
geno.file <- "tmp.geno.gds"
scan_annotation <- getAnnotation(scanAnnot)
snp_annotation <- getAnnotation(snpAnnot)
col.nums <- as.integer(c(1,2,10,11))
names(col.nums) <- c("snp", "sample", "a1", "a2")
diag.geno.file <- "diag.geno.RData"
diag.geno <- createDataFile(path=path, geno.file, file.type="gds",
                            variables="genotype",
                            snp.annotation=snp_annotation,
                            scan.annotation=scan_annotation,
                            sep.type="\t", skip.num=10, col.total=11,
                            col.nums=col.nums, scan.name.in.file=1,
                            diagnostics.filename=diag.geno.file)
sample1.txt sample2.txt sample3.txt scanAnnot_fake.txt GDA_A1_snps.txt
I just ran your code and got the expected (non-empty) output:
> diag.geno
$read.file
[1] 1 1 1
$row.num
[1] 90 90 90
$samples
$samples[[1]]
[1] "sample1"
$samples[[2]]
[1] "sample2"
$samples[[3]]
[1] "sample3"
$sample.match
[1] 1 1 1
$missg
$missg[[1]]
character(0)
$missg[[2]]
character(0)
$missg[[3]]
character(0)
$snp.chk
[1] 1 1 1
$chk
[1] 1 1 1
Details on my R session:
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GWASTools_1.46.0 Biobase_2.60.0 BiocGenerics_0.46.0
loaded via a namespace (and not attached):
[1] shape_1.4.6 formula.tools_1.7.1 lattice_0.21-8 vctrs_0.6.3
[5] tools_4.3.1 generics_0.1.3 sandwich_3.0-2 tibble_3.2.1
[9] fansi_1.0.4 RSQLite_2.3.1 pan_1.9 blob_1.2.4
[13] pkgconfig_2.0.3 jomo_2.7-6 Matrix_1.6-0 data.table_1.14.8
[17] lifecycle_1.0.3 compiler_4.3.1 MatrixModels_0.5-2 codetools_0.2-19
[21] SparseM_1.81 quantreg_5.97 GWASExactHW_1.01 glmnet_4.1-8
[25] mice_3.16.0 pillar_1.9.0 nloptr_2.0.3 tidyr_1.3.0
[29] MASS_7.3-60 cachem_1.0.8 iterators_1.0.14 rpart_4.1.19
[33] boot_1.3-28.1 foreach_1.5.2 mitml_0.4-5 nlme_3.1-162
[37] tidyselect_1.2.0 dplyr_1.1.2 purrr_1.0.1 splines_4.3.1
[41] operator.tools_1.6.3 fastmap_1.1.1 grid_4.3.1 cli_3.6.1
[45] magrittr_2.0.3 survival_3.5-5 utf8_1.2.3 broom_1.0.5
[49] backports_1.4.1 bit64_4.0.5 quantsmooth_1.66.0 logistf_1.26.0
[53] bit_4.0.5 nnet_7.3-19 lme4_1.1-34 zoo_1.8-12
[57] memoise_2.0.1 DNAcopy_1.74.1 lmtest_0.9-40 mgcv_1.9-0
[61] rlang_1.1.1 Rcpp_1.0.11 glue_1.6.2 DBI_1.1.3
[65] gdsfmt_1.36.1 rstudioapi_0.15.0 minqa_1.2.5 R6_2.5.1
Thank you. Let me double check my R session info and hopefully I can get the normal output.
Best, Ruyu
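For anyone doing the same check, here is a minimal sketch of comparing locally installed package versions against the ones in the maintainer's sessionInfo() above. The helper `compare_versions` is a hypothetical name, not part of GWASTools, and the choice of which packages to compare is an assumption; a version mismatch is only one possible cause of the difference in output.

```r
# Versions copied from the maintainer's sessionInfo() output in this thread.
reference <- c(GWASTools = "1.46.0", Biobase = "2.60.0",
               BiocGenerics = "0.46.0", gdsfmt = "1.36.1")

# Hypothetical helper (not part of GWASTools): compare installed versions
# with a named character vector of reference versions.
# Packages that are not installed are reported as NA.
compare_versions <- function(reference) {
  installed <- vapply(names(reference), function(pkg) {
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  data.frame(package = names(reference),
             reference = unname(reference),
             installed = unname(installed),
             same = !is.na(installed) & unname(installed) == unname(reference),
             row.names = NULL, stringsAsFactors = FALSE)
}

compare_versions(reference)
```

Rows where `same` is FALSE point to packages worth updating or reinstalling before rerunning createDataFile.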
Hi,
I have raw text files from Illumina and I am following the DataCleaning guide to prepare the snpAnnotation and scanAnnotation data frames. But when I tried to generate the GDS file, the corresponding diagnostic file returned NULL for several values, including sample, sample.match, etc. I have attached the empty output I received. I double-checked that the file path, the two annotation data frames, and the raw data files are all good. Can you give me some idea why I am having this issue? Thank you!
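The symptom described here (some diagnostics entries coming back NULL) can be checked programmatically. The sketch below uses a hypothetical helper, `empty_diagnostics`, which is not part of GWASTools, and toy lists that only mimic the structure of the diagnostics shown in this thread; the `bad` list is an assumption about what an empty result might look like, not the actual attached output.

```r
# Hypothetical helper (not part of GWASTools): return the names of the
# elements of a diagnostics list that are NULL or have length zero.
empty_diagnostics <- function(diag) {
  is_empty <- vapply(diag,
                     function(x) is.null(x) || length(x) == 0L,
                     logical(1))
  names(diag)[is_empty]
}

# Toy lists mimicking the structure of the diagnostics in this thread.
# NULL elements are created inside list(), because assigning NULL with
# `$<-` would drop the element instead of storing it.
good <- list(read.file = c(1, 1, 1),
             row.num = c(90, 90, 90),
             samples = list("sample1", "sample2", "sample3"),
             sample.match = c(1, 1, 1))
bad <- list(read.file = c(1, 1, 1),
            row.num = c(90, 90, 90),
            samples = NULL,
            sample.match = NULL)

empty_diagnostics(good)  # character(0)
empty_diagnostics(bad)   # "samples" "sample.match"
```

Running the helper on the list returned by createDataFile shows at a glance which checks produced no values.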