alyssafrazee / polyester

Bioconductor package "polyester", devel version. RNA-seq read simulator.
http://biorxiv.org/content/early/2014/12/12/006015

Error when report_coverage is set to true #74

Closed by NatPRoach 3 years ago

NatPRoach commented 3 years ago

Hello, I'm working on a Galaxy wrapper for polyester. While testing all of the parameters, we noticed that when report_coverage is TRUE (passed as reportCoverage = TRUE in the script below), simulate_experiment fails with the following error:

Error in coverage_matrices[[target]] : 
  attempt to select less than one element in get1index
Calls: simulate_experiment -> sgseq

Any feedback you could provide on getting this parameter to work would be greatly appreciated. The full script being generated by this particular run of our wrapper is the following:

### This just shuttles errors to stderr so Galaxy can collect them
options(show.error.messages=FALSE, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,FALSE)})
loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")

library(polyester)
library(Biostrings)

fasta <- '/Users/nproach/Documents/galaxy/database/files/001/dataset_1561.dat'
fasta_file <- readDNAStringSet(fasta)
seqpath <- NULL
gtf <- NULL

num_reps = c(
        2,
        2
)

reads_per_transcript <- matrix(rep(2, length(fasta_file)), ncol=1)
meanmodel <- FALSE

size <- NULL

temp <- read.delim('/Users/nproach/Documents/galaxy/database/files/001/dataset_1619.dat', header=FALSE)
fold_changes <- as.matrix(temp)

paired <- FALSE

reportCoverage <- TRUE

readlen <- 100

distr <- 'normal'
fraglen <- 250.0
fragsd <- 25.0

error_rate <- 0.005
error_model <- 'uniform'

bias <- 'none'

strand_specific <- FALSE

seed <- 42

simulate_experiment(fasta = fasta,
                    gtf = gtf,
                    seqpath = seqpath,
                    num_reps = num_reps,
                    reads_per_transcript = reads_per_transcript,
                    size = size,
                    fold_changes=fold_changes,
                    paired = paired,
                    reportCoverage = reportCoverage,
                    readlen = readlen,
                    distr = distr,
                    fraglen = fraglen,
                    fragsd = fragsd,
                    error_model = error_model,
                    error_rate = error_rate,
                    bias = bias,
                    strand_specific = strand_specific,
                    seed = seed
                    )

NatPRoach commented 3 years ago

Looking into this a bit further, the error appears to occur when the FASTA file being simulated from contains duplicate nucleotide sequences under different names. So it seems like this was user error.
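
For anyone who runs into the same error, one way to check the input before calling simulate_experiment is to group record names by identical sequence. The following is a minimal sketch, not part of polyester: find_duplicate_sequences is a hypothetical helper name, and the toy sequences are made up for illustration. It works on a named character vector, which is what as.character() returns for a DNAStringSet read with readDNAStringSet.

```r
# Hypothetical helper (not part of polyester): returns a list with one entry
# per sequence that appears more than once, holding the names that share it.
find_duplicate_sequences <- function(seqs) {
  stopifnot(is.character(seqs), !is.null(names(seqs)))
  # Group record names by their sequence string
  groups <- split(names(seqs), unname(seqs))
  # Keep only sequences shared by two or more records
  groups[lengths(groups) > 1]
}

# Toy input (made up): tx1 and tx3 carry the same sequence under different names
toy <- c(tx1 = "ACGTACGT", tx2 = "GGGTTTAA", tx3 = "ACGTACGT")
dups <- find_duplicate_sequences(toy)
print(unname(dups))

# With a real FASTA file (assumes Biostrings is installed and `fasta` is the
# path used in the script above):
#   library(Biostrings)
#   seqs <- as.character(readDNAStringSet(fasta))
#   dups <- find_duplicate_sequences(seqs)
#   if (length(dups) > 0) stop("Duplicate sequences found in FASTA input")
```

Whether to drop the duplicates or just stop with a message is up to the wrapper. Stopping early at least gives users something clearer than the get1index error.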