Closed gianmaz closed 3 years ago
Hi,
Wow, I didn't see this in... nearly 2 years. Sorry!
@nellore might be able to respond this question more precisely, but my understanding is that Rail-RNA will produce the same results if it's run with the same set of samples. We didn't run samples individually, as Rail-RNA scales up if you run samples in small batches. This could explain small differences if you re-run Rail-RNA with a small subset of samples.
However, large differences as the ones you noticed, are likely due to how many reads were processed by Rail-RNA. For example, maybe when we ran Rail-RNA we had an issue downloading the data from SRA and only a portion of it got downloaded. That's part of the information we provide in the colData(rse_gene)
information (or in the phenotype text table file).
For the prep_merge.R
error, a reprex
would be useful to help you more.
Finally, note that recount3
is coming out soon that was developed by @ChristopherWilks.
Best, Leo
I'm closing this since recount3
is available on Bioconductor now and the pre-print is around the corner. Plus this issue has been inactive for a while. Though if you still have questions, please feel free to re-open it. Thanks!
To whom, it may concern. I would like to run the same pipeline used to generate Recount2 in my dataset. To check if the rail-RNA pipe worked correctly I ran it on a small project uploaded in Recount2 (SRP047399)
Rail-RNA was run using this command:
rail-rna go local -x dir_bowtie_hg38 dir_bowtie2_hg38 -m manifest_file --log dir_log_file -d tsv,bw,jx -f --scratch temp_folder_dir
based on Bowtie and Bowtie2 indexes for the reference genome hg38. Downloaded from:
I genome Illumina:
I used the following data to run the R scripts after the rail-RNA pipeline.
However when I match the gene counts or exon counts that I obtained against those uploaded in Recount2 I found some differences. In some cases, these differences resulted in more than 100.000 read counts
Moreover, I get this error when I run the prep_merge.R script oo <- findOverlaps(jx_gr, introns_unique, type = 'equal') stopifnot(length(unique(queryHits(oo))) == length(oo))
Can this be due to the use of wrong annotations?
I read that the tool is deterministic, I expected to obtain the same results. I check the number of reads in input and they are the same. However, the number of mapped reads is different
Do you have any idea why this happens? Considering that I would like to compare my samples with the data in Recount. Please, can you confirm that the pipeline and the reference and the annotation files are correct?
Thank you. Best Regards, Gianluca
Please find below an example for one of the sample.