broadinstitute / ssGSEA2.0

Single sample Gene Set Enrichment analysis (ssGSEA) and PTM Enrichment Analysis (PTM-SEA)

Error while sourcing the file ssgsea-cli.R #8

Closed Manikgarg closed 5 years ago

Manikgarg commented 5 years ago

When I source the file ssgsea-cli.R, I get the following error:

source('pathToScript/ssgsea-cli.R', echo=TRUE)

#!/usr/bin/env Rscript

options( warn = -1 )
suppressPackageStartupMessages( if(!require("pacman")) install.packages ("pacman") )
suppressPackageStartupMessages(p_load("optparse"))

# parse the directory this file is located
this.file.dir <- commandArgs()[4]
this.file.dir <- sub('^(.(/|\\)).', '\1', sub('.*?\=','', this.file.dir))

# specify command line arguments
option_list <- list(
+   make_option( c("-i", "--input"), action='store', type='character', dest='input.ds', hel .... [TRUNCATED]

# parse command line parameters
opt <- parse_args( OptionParser(option_list=option_list) )

# hard-coded parameters
spare.cores <- 0 # use all available cpus
log.file <- paste(opt$output.prefix, '_ssgsea.log.txt', sep='')

# source the actual script
source("pathToFolder/src/ssGSEA2.0.R")

# run ssGSEA
.... [TRUNCATED]

Error in GSDB[[i]] : subscript out of bounds

karstenkrug commented 5 years ago

The file ssgsea-cli.R is an executable R script that is meant to be used as a command-line interface for ssGSEA/PTM-SEA and should not be sourced into an R session. Instead, use the script to call R from the command line (cmd.exe on Windows, a terminal on Linux/macOS).

To get a list of all parameters that you can specify on the command line, use the -h flag. On Windows the command would look like this:

C:\path\to\ssGSEA\>Rscript ssgsea-cli.R -h
Usage: ssgsea-cli.R [options]

Options:
        -i INPUT, --input=INPUT
                Path to input GCT file.

        -o OUPTUT, --ouptut=OUPTUT
                File prefix for output files.

        -d DB, --db=DB
                Path to gene set database (GMT format).

        -n NORM, --norm=NORM
                Sample normalization: "rank", "log", "log.rank" or "none".

        -w WEIGHT, --weight=WEIGHT
                When weight==0, all genes have the same weight; if weight>0 actual values matter and can change the resulting score.

        -c CORREL, --correl=CORREL
                Correlation type: "rank", "z.score", "symm.rank".

        -t TEST, --test=TEST
                Test statistic: "area.under.RES", "Kolmogorov-Smirnov"

        -s SCORE, --score=SCORE
                Score type: "ES" - enrichment score,  "NES" - normalized ES

        -p PERM, --perm=PERM
                Number of permutations

        -m MINOVERLAP, --minoverlap=MINOVERLAP
                Minimal overlap between signature and data set.

        -x EXTENDEDOUTPUT, --extendedoutput=EXTENDEDOUTPUT
                It TRUE additional stats on signature coverage etc. will be included as row annotations in the GCT results files.

        -e EXPORT, --export=EXPORT
                For each signature export expression GCT files.

        -g GLOBALFDR, --globalfdr=GLOBALFDR
                If TRUE global FDR across all data columns is calculated.

        -l LIGHTSPEED, --lightspeed=LIGHTSPEED
                If TRUE processing will be parallized across gene sets. (I ran out of single letters to define parameters...)

        -h, --help
                Show this help message and exit

On Linux/macOS you don't have to put the Rscript command in front of the script name, but make sure that the file is executable.
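For Linux/macOS users, the executable-bit step could be sketched roughly as follows. This is only an illustration: `run_cli` is a hypothetical helper name, not part of ssGSEA2.0, and it assumes the script's `#!/usr/bin/env Rscript` line can find Rscript on your PATH.

```shell
# run_cli SCRIPT [ARGS...]
# Hypothetical helper: set the executable bit if it is missing, then call
# the script directly so that its shebang line selects the interpreter.
run_cli() {
    script="$1"
    shift
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    # Equivalent to running `chmod +x` once by hand.
    [ -x "$script" ] || chmod +x "$script"
    "$script" "$@"
}
```

From the directory that contains the script, `run_cli ./ssgsea-cli.R -h` should then print the same help text as the Windows example above.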

A minimal example of running ssGSEA/PTM-SEA from command line would look something like this:

C:\path\to\ssGSEA\>Rscript ssgsea-cli.R -i C:\path\to\gct\mydata.gct -d C:\path\to\ssGSEA\db\msigdb\h.all.v6.2.symbols.gmt -o testrun
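On Linux/macOS the same call might be wrapped like this. It is a sketch, not something shipped with ssGSEA2.0: the function name `run_ssgsea` and the overridable `RSCRIPT` variable are assumptions made for illustration, and only the `-i`, `-d` and `-o` flags from the help text above are used.

```shell
# Assumption: Rscript is on the PATH; set RSCRIPT to use a different binary.
RSCRIPT="${RSCRIPT:-Rscript}"

# run_ssgsea CLI_SCRIPT INPUT_GCT GENE_SET_DB OUTPUT_PREFIX
# Hypothetical wrapper: check that the files exist before starting R, then
# pass them on as -i (input GCT), -d (GMT database) and -o (output prefix).
run_ssgsea() {
    cli="$1"
    gct="$2"
    db="$3"
    prefix="$4"
    for f in "$cli" "$gct" "$db"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    "$RSCRIPT" "$cli" -i "$gct" -d "$db" -o "$prefix"
}
```

A call would then look like `run_ssgsea /path/to/ssGSEA/ssgsea-cli.R /path/to/gct/mydata.gct /path/to/ssGSEA/db/msigdb/h.all.v6.2.symbols.gmt testrun`, with the paths adjusted to your own setup.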

I hope that helps. K

Manikgarg commented 5 years ago

Thanks @karstenkrug!