deweylab / RSEM

RSEM: accurate quantification of gene and isoform expression from RNA-Seq data
http://deweylab.biostat.wisc.edu/rsem/
GNU General Public License v3.0

rsem-prepare-reference with STAR using Trusted Sources #49

Open deto opened 7 years ago

deto commented 7 years ago

I ran rsem-prepare-reference in '--star' mode while also using '--trusted-sources' to limit the sources to "BestRefSeq" and "Curated Genomic". However, when running rsem-calculate-expression, after the STAR alignment step, it fails with an error saying that the BAM file contains many more sequences than RSEM knows about.

It looks like the issue is that the --trusted-sources information is not passed to STAR when it builds its reference, so STAR indexes many more transcripts than RSEM keeps in its own files.

I think I'll just try a workaround where I pre-filter my GTF file, but I thought I'd bring this to your attention so you could either fix it or have the tool raise an error when --star and --trusted-sources are used together.
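
For anyone wanting to try the pre-filtering workaround, here is a minimal sketch in Python. It assumes the annotation is a tab-separated GTF/GFF file with the source in the second column, and that keeping only records whose source exactly matches a trusted name is close enough to what --trusted-sources does. It is not RSEM's own filtering logic. The function names and file paths are hypothetical.

```python
def filter_by_source(lines, trusted_sources):
    """Yield comment/blank lines plus records whose source (column 2) is trusted.

    Assumption: exact string match on the source column; combined sources
    such as "BestRefSeq%2CGnomon" are NOT treated as trusted here.
    """
    trusted = set(trusted_sources)
    for line in lines:
        # Keep header/comment lines and blank lines unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1] in trusted:
            yield line


def filter_file(in_path, out_path, trusted_sources):
    """Write a filtered copy of the annotation file (hypothetical helper)."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        fout.writelines(filter_by_source(fin, trusted_sources))


# Hypothetical usage (paths are placeholders):
# filter_file("annotation.gff", "annotation.trusted.gff",
#             ["BestRefSeq", "Curated Genomic"])
```

The filtered file would then be given to rsem-prepare-reference in place of the original, without --trusted-sources, so that STAR and RSEM build their references from the same set of records. It is worth checking that parent features (genes, transcripts) still exist for every record that is kept, since a line-by-line filter can drop a parent while keeping its children.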

crutching commented 7 years ago

For the record, I ran into this same issue.

rocanja commented 2 years ago

Encountered the same problem! Seeing that this issue has been open since 2017, it would be nice to at least have a note about this problem in the vignette, please! I think this is conceptually similar to https://github.com/deweylab/RSEM/issues/143 ... (a mismatch between the annotation GTF/GFF and the genome FASTA seems to cause the error, but this mismatch can arise for various reasons).