Open kaltinel opened 3 years ago
How can you run rsem-calculate-expression normally; the bam file I got from star cannot be recognized;
error : Warning: The SAM/BAM file declares less reference sequences (194) than RSEM knows (225618)! Please make sure that you aligned your reads against transcript sequences instead of genome. RSEM can not recognize reference sequence name chr1! "rsem-parse-alignments ./index/hg38 CD4-T.temp/CD4-T CD4-T.stat/CD4-T ./STAR_v33/CD4-T/CD4-TAligned.sortedByCoord.out.bam 1 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
the codes as follows: /software/RSEM-1.3.3/rsem-prepare-reference -p 16 --gtf ref_annot.gtf --star --star-path /software/STAR-2.7.2b/source/ ref_genome.fa ./index/hg38 /software/RSEM-1.3.3/rsem-calculate-expression -p 16 --alignments --paired-end ./STAR_v33/CD4-T/CD4-TAligned.sortedByCoord.out.bam ./index/hg38 CD4-T
thanks very much.
Hi,
I started to use RSEM, and I am happy with what it can do! I would like to use it further, to be able to pick the most abundant isoform of each gene, therefore wanted to try
rsem-calculate-expression
.I am a little bit puzzled with its results unfortunately.
I can give an example in here for gene EEF1A1 : where the highlighted transcript is having the highest isoform abundance percentage, which is also one of the shortest one..
In the table format:
We can see that
rsem-calculate-expression
picks one of the shortest isoform, which is a non coding + retained intron type transcript as the most abundant one. However I would expect a much longer protein coding transcript to be the most abundant (which are coloured in blue for easy visualization).To be able to run
rsem-calculate-expression
, I first runSTAR
and used the output files mapped to transcriptome. And then use them asrsem-calculate-expression
input. My command line for thersem-calculate-expression
:rsem-prepare-reference --gtf gencode.v31.primary_assembly.annotation.gtf GRCh38.primary_assembly.genome.fa gencodev31
And then, I merged my STAR outputs (Transcriptome.out.bam files) for the same condition and run:rsem-calculate-expression -p 12 --fragment-length-mean 54.9 --fragment-length-sd 1.20921 --calc-pme --calc-ci --estimate-rspd --num-rspd-bins 20 --keep-intermediate-files --time --alignments merged_RNASEQ_WT_Transcriptome.out.bam gencodev31 RNAseq_WT
I would be happy to hear from you regarding where I am making a mistake in the process itself/interpretation of the results.
Thank you!