alyssafrazee / polyester

Bioconductor package "polyester", devel version. RNA-seq read simulator.
http://biorxiv.org/content/early/2014/12/12/006015

Error in simulate_experiment #20

Closed qingl0331 closed 9 years ago

qingl0331 commented 9 years ago

Hello! Thank you for writing this package. I need to simulate RNA-seq reads with different coverage for different transcripts, and this seems to be the only software that can do it with transcripts as its input. I only need one group (119 transcripts), but the R package seems to encourage at least 2 groups, so I just duplicated the same group in my code. This is my code:

library(MASS)
fold_changes = matrix(rnegbin(n=119, mu=5, theta=1.5), 119, 2)
library(polyester)
library(Biostrings)

fasta = readDNAStringSet("/data1/qli/conus_transcriptome/trinity_param_twist/reads_simu/Cvic.rf.fa")

writeXStringSet(fasta, 'cvic.fa')

readspertx = round(20 * width(fasta) / 84)
simulate_experiment('cvic.fa', reads_per_transcript=readspertx, num_reps=c(119,119), fold_changes=fold_changes, outdir='simu1', readlen=84, fraglen=170, fragsd=17, error_rate=0.001)

But it complains:

Error in as.matrix(basemeans)[, group_id] : subscript out of bounds
In addition: There were 50 or more warnings (use warnings() to see the first 50)

The warnings look something like:

48: In rnbinom(n = length(basemeans), mu = basemeans, size = size) : NAs produced
49: In rnbinom(n = length(basemeans), mu = basemeans, size = size) : NAs produced
50: In rnbinom(n = length(basemeans), mu = basemeans, size = size) : NAs produced

I am confused now. Do you know what is causing this? I really need your help to figure it out. Thank you!

alyssafrazee commented 9 years ago

Thanks for the report -- I'm working on looking into this and will get back to you when I find a solution. Could you please post your output from sessionInfo() here?

qingl0331 commented 9 years ago

sure~ Thx! Here it is:

sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] Biostrings_2.36.3 XVector_0.8.0 IRanges_2.2.7
[4] S4Vectors_0.6.3 BiocGenerics_0.14.0 polyester_1.4.0
[7] MASS_7.3-43 BiocInstaller_1.18.4

loaded via a namespace (and not attached):
[1] zlibbioc_1.14.0 limma_3.24.15 tools_3.2.1 logspline_2.1.8

alyssafrazee commented 9 years ago

This was a documentation bug (the example code was from an old version) and is now fixed in the new simulate_experiment help page (in version 1.5.1, via b4a8619ee2bcb9e3460e9b68196927d49aaeb63d). Thanks for the report & your patience!
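
For anyone landing here with the same error, a minimal sketch of a pre-flight check on the argument shapes may help. It assumes (from my reading of the updated help page, so please verify against your installed version) that num_reps holds one replicate count per group and that fold_changes has one row per transcript and one column per group. Under that reading, num_reps=c(119,119) would request 119 replicates in each of two groups rather than describe 119 transcripts. check_sim_args is a hypothetical helper, not part of polyester, and the replicate counts and baseline means are placeholders. Flagging non-positive fold changes is also my own assumption, prompted by the rnbinom NA warnings above.

```r
# Hypothetical helper, not part of polyester: collects problems with the
# argument shapes before they are handed to simulate_experiment().
check_sim_args <- function(n_tx, num_reps, fold_changes, reads_per_transcript) {
  problems <- character(0)
  n_groups <- length(num_reps)  # one entry per group; each value is a replicate count
  fc <- as.matrix(fold_changes)

  if (nrow(fc) != n_tx) {
    problems <- c(problems, "fold_changes needs one row per transcript")
  }
  if (ncol(fc) != n_groups) {
    problems <- c(problems, "fold_changes needs one column per group")
  }
  if (!(length(reads_per_transcript) %in% c(1, n_tx))) {
    problems <- c(problems, "reads_per_transcript needs length 1 or one value per transcript")
  }
  # Assumption: zero, negative, or missing multipliers are worth flagging,
  # since they would give a zero or undefined mean read count.
  if (any(is.na(fc)) || any(fc <= 0, na.rm = TRUE)) {
    problems <- c(problems, "fold_changes contains zero, negative, or missing values")
  }
  problems
}

n_tx <- 119
fold_changes <- matrix(1, nrow = n_tx, ncol = 2)  # two identical groups, no differential expression
reads_per_transcript <- rep(100, n_tx)            # placeholder baseline means
num_reps <- c(3, 3)                               # 2 groups, 3 replicates each (illustrative)

problems <- check_sim_args(n_tx, num_reps, fold_changes, reads_per_transcript)
```

If problems comes back empty, the same objects can be passed on as the reads_per_transcript, num_reps, and fold_changes arguments of simulate_experiment.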

qingl0331 commented 9 years ago

Thx!
