Closed bshim181 closed 7 months ago
Hi,
Thank you for your interest in the software.
I am reproducing this section from our README:
STAR utilizes two parameters for optimal identification of multi-mappers: --outFilterMultimapNmax and --winAnchorMultimapNmax. The author of STAR recommends that --winAnchorMultimapNmax be set at twice the value used in --outFilterMultimapNmax, but no less than 50. In our study, we used the same number for both parameters (100), and found negligible differences in identifying multi-mappers. Upon further discussion with the author of STAR, we recommend setting the same value for --winAnchorMultimapNmax and --outFilterMultimapNmax, though we strongly suggest users test multiple values of --winAnchorMultimapNmax to identify the optimal value for their experiment.
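For reference, the rule of thumb attributed to the STAR author above (twice the --outFilterMultimapNmax value, but no less than 50) is simple arithmetic. Here is a minimal sketch in POSIX shell. The helper name `win_anchor_for` is hypothetical and is not part of STAR or TEtranscripts. Note that the README itself ends up recommending the same value for both parameters.

```shell
# Hypothetical helper (not part of STAR or TEtranscripts): derive a
# --winAnchorMultimapNmax value from --outFilterMultimapNmax using the
# "twice the value, but no less than 50" rule of thumb.
win_anchor_for() {
    out_filter=$1
    win_anchor=$(( out_filter * 2 ))
    # Enforce the floor of 50.
    if [ "$win_anchor" -lt 50 ]; then
        win_anchor=50
    fi
    echo "$win_anchor"
}

win_anchor_for 20    # prints 50  (2 * 20 = 40, raised to the floor)
win_anchor_for 100   # prints 200
```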
In addition, we strongly recommend against using the --outSAMmultNmax parameter (i.e. leave it at default), as this would limit the number of alignments reported in the SAM file, thereby removing the benefits of multi-mapping.
We have found that the STAR parameters used for ENCODE RNA-seq mapping work for us (note the addition of --outFilterMultimapNmax and --winAnchorMultimapNmax, which we use to allow TE quantification):
STAR --genomeDir [STAR index] --readFilesIn [R1 FASTQ] [R2 FASTQ] \
--readFilesCommand zcat --runThreadN 10 --genomeLoad NoSharedMemory \
--alignSJoverhangMin 8 --alignSJDBoverhangMin 1 \
--outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 \
--alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 \
--outSAMheaderHD @HD VN:1.4 SO:coordinate --outSAMunmapped Within \
--outFilterType BySJout --outSAMattributes NH HI AS NM MD \
--outSAMtype BAM SortedByCoordinate --sjdbScore 1 --limitBAMsortRAM 30000000000 \
--outFilterMultimapNmax 100 --winAnchorMultimapNmax 150
If your genome is vastly different from those above (e.g. much larger or with more repetitive sequences), we recommend a saturation analysis to determine the best multi-mapping parameters (see #151 for more information).
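One possible way to set up such a sweep is sketched below as a dry run that only prints the STAR commands. This is an assumption about what a saturation analysis could look like, not the procedure described in #151. The index and FASTQ paths, the candidate values, and the `sat_<n>_` output prefix are all placeholders, and the remaining ENCODE options from the command above are left out for brevity.

```shell
#!/bin/sh
# Dry-run sketch: print one STAR command per candidate multi-mapping value.
# INDEX, R1, R2 and the candidate list are placeholders (assumptions).
INDEX="star_index"
R1="sample_R1.fastq.gz"
R2="sample_R2.fastq.gz"

# Build the command line for one candidate value, using the same value for
# --outFilterMultimapNmax and --winAnchorMultimapNmax as suggested above.
star_cmd() {
    n=$1
    echo "STAR --genomeDir $INDEX --readFilesIn $R1 $R2" \
         "--readFilesCommand zcat --outSAMtype BAM SortedByCoordinate" \
         "--outFilterMultimapNmax $n --winAnchorMultimapNmax $n" \
         "--outFileNamePrefix sat_${n}_"
}

for n in 50 100 200 400; do
    star_cmd "$n"
done
```

After actually running the printed commands, the "Number of reads mapped to multiple loci" and "Number of reads mapped to too many loci" lines in each `sat_<n>_Log.final.out` can be compared, for example with `grep -H -e "mapped to multiple loci" -e "mapped to too many loci" sat_*_Log.final.out`. The value beyond which these counts stop changing appreciably is a reasonable candidate.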
Thanks.
Hello,
I am attempting TE quantification from paired-end RNA-seq data (150 bp). I was wondering if there is a recommended or default setup for STAR alignment. What would you recommend starting with before moving on to TE quantification using TEtranscripts?