RTIInternational / biocloud_docker_tools

Docker library providing a catalog of images for RTI's cloud-based bioinformatics toolkit.
https://hub.docker.com/u/rtibiocloud
5 stars 11 forks source link

saige_gds #22

Closed earleyej closed 1 year ago

earleyej commented 1 year ago

Created a new docker image to run SAIGEgds.

jaamarks commented 1 year ago

LGTM - build success on my end.

For future improvements, I would suggest using optparse to parse arguments (see example below).

example using optparse ```R #!/usr/local/bin/Rscript # saige_gds.R - Script for running SAIGE analysis on genotype and phenotype data library(SeqArray, quietly=TRUE) library(SAIGEgds, quietly=TRUE) library(SNPRelate, quietly=TRUE) library(optparse, quietly=TRUE) # Create an option parser option_list <- list( make_option(c('--geno'), type='character', help='Path to genotype GDS file (required)'), make_option(c('--pheno'), type='character', help='Path to phenotype file (required)'), make_option(c('--grm'), type='character', help='Path to GRM file (required)'), make_option(c('--out'), type='character', help='Path to output file (required)') ) parser <- OptionParser(usage = "Usage: saige_gds.R [options]", option_list = option_list) # Parse command-line arguments args <- parse_args(parser) # Check for required arguments and display help if not provided if (is.null(args$geno) || is.null(args$pheno) || is.null(args$grm) || is.null(args$out)) { cat("Missing required argument(s)!\n") cat("Use --help for usage information.\n") q(status=1) } geno <- args$geno pheno_file <- args$pheno grm <- args$grm out_file <- args$out # read phenotype data pheno <- read.table(pheno_file, sep=" ", header=T, stringsAsFactors=F) colnames(pheno)[1]<-"sample.id" # Fit null model sampid <- seqGetData(grm, "sample.id") # sample IDs in the genotype file ##Assume that col 1 = sample ID # col 2 = phenotype # all other ols are covariates vars <- colnames(pheno) frm <- as.formula( paste0(vars[2]," ~ ",paste(vars[3:length(vars)],collapse="+")) ) glmm <- seqFitNullGLMM_SPA(frm, pheno, grm, trait.type="quantitative", sample.col="sample.id") # P-value calculation # assoc <- seqAssocGLMM_SPA(geno, glmm, mac=10, parallel=2, maf=0.01) # Save to file write.table(assoc, file=out_file, sep="\t", col.names=T, row.names=F, quote=F) ```