mhammell-laboratory / TEtranscripts

A package for including transposable elements in differential enrichment analysis of sequencing datasets.
http://hammelllab.labsites.cshl.edu/software/#TEtranscripts
GNU General Public License v3.0

STAR parameters prior to TE quantification #194

Closed by albertozenere 2 months ago

albertozenere commented 3 months ago

Hello, I am using STAR to do the alignment prior to TE quantification with TEcount and TElocal. I am using the following parameters in STAR:

    STAR --runThreadN 100 \
        --genomeDir "$genome_path" \
        --outSAMtype BAM Unsorted \
        --readFilesCommand zcat \
        --readFilesIn "$forward" "$reverse" \
        --outFileNamePrefix "${name}" \
        --outFilterMultimapNmax 500 \
        --winAnchorMultimapNmax 500 \
        --outFilterMultimapScoreRange 5

Is there any other parameter in STAR that is important for the TE quantification?

olivertam commented 3 months ago

Hi,

We typically don't change --outFilterMultimapScoreRange, since we try to keep only the "best" alignments where possible (as long as the read maps equally well to every reported location). However, you're welcome to use it. We also tend to follow parts of ENCODE's approach for mapping RNA-seq libraries, though that may be because we mostly work with human datasets:

    --outFilterMultimapNmax 20 \
    --alignSJoverhangMin 8 \
    --alignSJDBoverhangMin 1 \
    --outFilterMismatchNmax 999 \
    --outFilterMismatchNoverReadLmax 0.04 \
    --alignIntronMin 20 \
    --alignIntronMax 1000000 \
    --alignMatesGapMax 1000000 \
    --outFilterType BySJout \
    --outSAMattributes NH HI AS NM MD \
    --outSAMstrandField intronMotif \
    --sjdbScore 1
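
As a rough illustration, the options listed above could be merged with the command from the question along these lines. This is only a sketch: the paths, the thread count, and the `run_star` helper are hypothetical placeholders, and the multimapper limit is left as an argument because the list above uses 20 while the question uses 500. By default the function only prints the command (`DRY_RUN=1`) so it can be reviewed before running.

```shell
#!/bin/sh
# Sketch only: paths, thread count, and the run_star helper are placeholders,
# not part of TEtranscripts or STAR.
genome_path="${genome_path:-/path/to/star_index}"
forward="${forward:-sample_R1.fastq.gz}"
reverse="${reverse:-sample_R2.fastq.gz}"
name="${name:-sample_}"

run_star() {
    # $1 = multimapper limit, applied to both --outFilterMultimapNmax and
    # --winAnchorMultimapNmax (defaults to 20, the value in the list above)
    multimap_max="${1:-20}"
    set -- STAR \
        --runThreadN 8 \
        --genomeDir "$genome_path" \
        --readFilesCommand zcat \
        --readFilesIn "$forward" "$reverse" \
        --outFileNamePrefix "$name" \
        --outSAMtype BAM Unsorted \
        --outFilterMultimapNmax "$multimap_max" \
        --winAnchorMultimapNmax "$multimap_max" \
        --alignSJoverhangMin 8 \
        --alignSJDBoverhangMin 1 \
        --outFilterMismatchNmax 999 \
        --outFilterMismatchNoverReadLmax 0.04 \
        --alignIntronMin 20 \
        --alignIntronMax 1000000 \
        --alignMatesGapMax 1000000 \
        --outFilterType BySJout \
        --outSAMattributes NH HI AS NM MD \
        --outSAMstrandField intronMotif \
        --sjdbScore 1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"   # print the assembled command without running STAR
    else
        "$@"        # run STAR with exactly the arguments assembled above
    fi
}

# Print the command with the limit used in the question.
run_star 500
```

Setting `DRY_RUN=0` before calling `run_star` would execute STAR instead of printing the command.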

albertozenere commented 3 months ago

Thank you,

I was wondering about the parameters --outFilterMultimapNmax and --winAnchorMultimapNmax. From your manual, I see that you suggest a value of 100 for both. Would it be a problem to use a value of 500 instead?

olivertam commented 3 months ago

Hi,

You can certainly use the higher values. We find that for human and mouse there are diminishing returns once you go above 100, as you start picking up low-complexity repeats. However, this will differ depending on your organism and genome build.

Thanks.
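
One way to judge whether a higher limit changes much for a given genome is to compare how many reads STAR reports as "mapped to too many loci" in each run's `Log.final.out`, since reads exceeding --outFilterMultimapNmax are counted there. The helper below is a minimal sketch: `too_many_loci` is a hypothetical name, and it assumes the usual `label |<tab>value` layout of that file.

```shell
#!/bin/sh
# Sketch only: too_many_loci is a hypothetical helper, and the parsing assumes
# the usual "label |<tab>value" layout of STAR's Log.final.out.
too_many_loci() {
    # $1 = path to a Log.final.out file written by STAR
    awk -F'|' '/Number of reads mapped to too many loci/ {
        gsub(/[ \t]/, "", $2)   # strip padding around the value
        print $2
    }' "$1"
}

# Example with hypothetical output prefixes from two runs:
#   too_many_loci run100_Log.final.out
#   too_many_loci run500_Log.final.out
```

If the count barely drops between the two runs, the extra alignments admitted by the higher limit are affecting few reads.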

albertozenere commented 2 months ago

Thank you!