Closed by SBata 2 years ago
Hi,
Thank you for your interest in the software. Here are some comments/suggestions:
- [...] --quantMode when we run our STAR, but I don't see that as a problem, so you can keep it in (it would just take longer, and somewhat redundant as TEtranscripts will try to independently quantify afterwards).
- [...] no (unstranded). Thus, if your library is stranded, it is recommended to set --stranded [forward/reverse] accordingly.
- [...] --outFilterScoreMinOverLread and --outFilterMatchNminOverLread parameters to your new STAR run if you like, as our recommendations are intended to supplement your current STAR parameters.
- [...] --winAnchorMultimapNmax to 150. I don't think this is the cause of the discrepancies, but we have used this on our own data without issue.
- [...] xxxx.cntTable, xxxx.gene_TE_analysis.txt (TEtranscripts), and the STAR geneCounts (both runs) where your gene of interest (you can mask the name) is expected to be differentially expressed.
Thanks.
Hi Oliver, thank you for the quick reply.
I think you hit the nail on the head with the stranded option. The library is stranded, and I'd need to use the 4th column from STAR.
I went back and checked the STAR ReadsPerGene.out.tab output for each sample. Comparing that to TEtranscripts, I get the following:
--- TEtranscripts counts for my gene ---
GeneName      KD1  KD2  KD3  WT1  WT2  WT3
ENSG00000xxx    8    8    6   11    2   14
--- STAR output (based on ReadsPerGene.out.tab) ---
GeneName      Unstr   S1     S2   Sample
ENSG00000xxx   1464    8   1456   KD1
ENSG00000xxx   1513    8   1505   KD2
ENSG00000xxx   1449    6   1443   KD3
ENSG00000xxx   4435   11   4424   WT1
ENSG00000xxx   3788    2   3786   WT2
ENSG00000xxx   4555   14   4542   WT3
I can tell you that the WT cells are glowing with this transcript so a count of 2 or 10 wouldn't even make sense biologically.
So this tells me that in TEtranscripts I should use the --stranded forward option to capture the second-strand cDNA library, or the 4th column in STAR. Does that make sense?
Hi,
S2 in the STAR output means that read 2 (in a paired-end library) is aligned in the direction of the transcript, which means that the library is reverse-stranded, as read 1 (single- or paired-end) would be in the opposite direction of the transcript. Thus, you would need the --stranded reverse option.
Hope this post clears it up.
Thanks.
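To make the column-to-option reasoning above concrete, here is a minimal Python sketch that compares the two stranded count columns of ReadsPerGene.out.tab and suggests a --stranded value. The function name, the 0.9 threshold, and the mapping of column 3 to "forward" are assumptions for illustration only (the thread itself only confirms that a dominant 4th column means "reverse"); none of this is part of STAR or TEtranscripts.

```python
# Sketch only: the helper name and the 0.9 threshold are illustrative
# assumptions, not part of STAR or TEtranscripts.

def infer_stranded(read1_sense, read2_sense, threshold=0.9):
    """Suggest a --stranded value from columns 3 and 4 of ReadsPerGene.out.tab.

    read1_sense: column 3 (read 1 on the same strand as the transcript)
    read2_sense: column 4 (read 2 on the same strand as the transcript)
    """
    total = read1_sense + read2_sense
    if total == 0:
        raise ValueError("no stranded counts to compare")
    if read2_sense / total >= threshold:
        return "reverse"  # column 4 dominates, as in this thread
    if read1_sense / total >= threshold:
        return "forward"  # assumed mirror case: column 3 dominates
    return "no"           # neither column dominates: treat as unstranded


# Counts for the masked gene, copied from the table in this thread:
# sample -> (unstranded, column 3, column 4)
thread_counts = {
    "KD1": (1464, 8, 1456),
    "KD2": (1513, 8, 1505),
    "KD3": (1449, 6, 1443),
    "WT1": (4435, 11, 4424),
    "WT2": (3788, 2, 3786),
    "WT3": (4555, 14, 4542),
}

for sample, (_unstranded, col3, col4) in thread_counts.items():
    print(sample, infer_stranded(col3, col4))
```

With the counts from this thread, every sample comes out as "reverse". A single gene is thin evidence, so in practice it would be safer to sum columns 3 and 4 over all genes before deciding.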
Hi there,
I just ran TEtranscripts on a simple set of 3 control vs. 3 treated RNA-seq samples. I previously used those samples for other studies and quantified the reads using STAR. I used the STAR parameters suggested in the tutorial, and the script is:
However, when I finished running TEtranscripts, I could barely match its gene-level results (in the xxx.gene_TE_analysis.txt file, and the filtered one) with the results I had before (using the Original STAR script above). For example, it didn't even detect differential expression in my knockout gene (which I know is there; we test it in every way possible each time we run an experiment). Now, is this because of the parameters needed to run TEtranscripts?