Closed shrukane closed 8 years ago
Hey Shruti, there is currently no method for that. But I will write one in and incorporate it into the code. This is a pretty useful feature. Would you just like to have the number of reads that cover each base of the reference sequences that you submitted for simulation from?
That would be great ! Thank you.
Hi, any updates ?
Hey! Sorry I was on vacation, but I have written the code this morning to the devel version to polyester. Please try it out and see if it does what you want it to do. To start polyester in devel please do:
library(devtools)
dev_mode()
install_github("alyssafrazee/polyester")
library(polyester)
Then you can run the simulation as you normally do, except modify the simulate_experiment
command slightly by inserting the argument
simulate_experiment(..., reportCoverage=T)
This will create a sample_coverages.rda
file in the same directory where you have generated the reads.
In this file, is stored a list of matrices. The names of each entry in the list corresponds to the transcript from which reads are generated. Each entry in the list is a matrix for that transcript. The i-th column of the matrix corresponds to i-th dataset generated from polyester, while the j-th row corresponds to the number of reads that will map to that position along that transcript (theoretical prefect coverage before errors are introduced during simulation).
Let me know if you need any more help!
This is sample code:
library(devtools)
dev_mode(on=T)
install_github("alyssafrazee/polyester")
library(polyester)
library(Biostrings)
# FASTA annotation
fasta_file = system.file('extdata', 'chr22.fa', package='polyester')
fasta = readDNAStringSet(fasta_file)
# subset the FASTA file to first 20 transcripts
small_fasta = fasta[1:20]
writeXStringSet(small_fasta, 'chr22_small.fa')
# here all transcripts will have ~equal FPKM
readspertx = round(20 * width(small_fasta) / 100)
# simulation call:
simulate_experiment('chr22_small.fa', reads_per_transcript=readspertx, num_reps=10,
fold_changes=1, outdir='simulated_reads', reportCoverage=T)
Look for the sample_coverages.rda
file in the same directory where the sample_*.fasta files are.
hi, sorry for the late reply. when I run my code I get the following error Error in save(coverage_matrices, file = paste0(outdir, "/sample_coverages.rda")) : object ‘coverage_matrices’ not found Any suggestions ?
Hi, I know this is a closed call but I have one more question. Is this method incorporated in the latest version of polyester ? I've been working on the latest version and i could not find the sample_coverages.rda file. -Shruti
Is there any way we can plot the distribution of reads in the simulation ?