alexdobin / STAR

RNA-seq aligner
MIT License

Number of input reads is too low #1744

Open AudreyBrown899 opened 1 year ago

AudreyBrown899 commented 1 year ago

I am having an issue where STAR isn't processing all of my input reads. The program runs without error, but the final log reports only 157 input reads, when my transcriptome FASTA file should contain thousands. So far I have been unable to determine why, or to find a pattern in the reads it does recognize.

This is the code I'm running:

    STAR --runMode alignReads \
      --runThreadN 6 \
      --genomeDir /uufs/chpc.utah.edu/common/home/werner-group1/audrey/STARgenomeindex \
      --readFilesIn /uufs/chpc.utah.edu/common/home/werner-group1/audrey/japonicus_transcriptome.fasta \
      --outFileNamePrefix /uufs/chpc.utah.edu/common/home/werner-group1/audrey/STARresults/japonicustest3
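A quick way to confirm how many records the input file really holds, independent of STAR, is to count its header lines. The following is a minimal sketch in Python; the function name `count_fasta_records` and the demo file are made up for illustration, and it assumes a plain-text (uncompressed) FASTA file in which every record starts with a `>` line.

```python
import os
import tempfile


def count_fasta_records(path):
    """Return the number of FASTA records, i.e. lines that start with '>'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


if __name__ == "__main__":
    # Tiny made-up FASTA file, used only to demonstrate the function.
    demo = ">seq1\nACGT\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n"
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(demo)
    print(count_fasta_records(tmp.name))
    os.remove(tmp.name)
```

Pointing the function at the real FASTA path gives a number to compare against "Number of input reads" in the final log.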

This is the final Log:

                             Started job on |   Jan 20 09:31:23
                         Started mapping on |   Jan 20 09:31:40
                                Finished on |   Jan 20 09:31:41
   Mapping speed, Million of reads per hour |   0.57

                      Number of input reads |   157
                  Average input read length |   285
                                UNIQUE READS:
               Uniquely mapped reads number |   134
                    Uniquely mapped reads % |   85.35%
                      Average mapped length |   276.30
                   Number of splices: Total |   178
        Number of splices: Annotated (sjdb) |   0
                   Number of splices: GT/AG |   171
                   Number of splices: GC/AG |   5
                   Number of splices: AT/AC |   0
           Number of splices: Non-canonical |   2
                  Mismatch rate per base, % |   0.24%
                     Deletion rate per base |   0.00%
                    Deletion average length |   1.00
                    Insertion rate per base |   0.00%
                   Insertion average length |   0.00
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |   6
         % of reads mapped to multiple loci |   3.82%
    Number of reads mapped to too many loci |   0
         % of reads mapped to too many loci |   0.00%
                              UNMAPPED READS:
Number of reads unmapped: too many mismatches |   0
     % of reads unmapped: too many mismatches |   0.00%
          Number of reads unmapped: too short |   17
               % of reads unmapped: too short |   10.83%
              Number of reads unmapped: other |   0
                   % of reads unmapped: other |   0.00%
                              CHIMERIC READS:
                     Number of chimeric reads |   0
                          % of chimeric reads |   0.00%
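The counts in this log are internally consistent: 134 + 6 + 0 + 0 + 17 + 0 = 157, so every read that STAR counted as input is accounted for, and the missing sequences were never counted as input at all. The sketch below checks that arithmetic in Python. It assumes one `name | value` pair per line; the helper names `parse_final_log` and `summarize` are hypothetical, and the sample text copies only the count lines from the log above.

```python
LOG_TEXT = """\
Number of input reads | 157
Uniquely mapped reads number | 134
Number of reads mapped to multiple loci | 6
Number of reads mapped to too many loci | 0
Number of reads unmapped: too many mismatches | 0
Number of reads unmapped: too short | 17
Number of reads unmapped: other | 0
"""

# Categories that together should add up to the number of input reads.
CATEGORIES = [
    "Uniquely mapped reads number",
    "Number of reads mapped to multiple loci",
    "Number of reads mapped to too many loci",
    "Number of reads unmapped: too many mismatches",
    "Number of reads unmapped: too short",
    "Number of reads unmapped: other",
]


def parse_final_log(text):
    """Split each 'name | value' line into a dictionary entry."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def summarize(stats):
    """Return total input reads, per-category counts, and percentages."""
    total = int(stats["Number of input reads"])
    counts = {name: int(stats[name]) for name in CATEGORIES}
    percents = {name: round(100 * n / total, 2) for name, n in counts.items()}
    return total, counts, percents


if __name__ == "__main__":
    total, counts, percents = summarize(parse_final_log(LOG_TEXT))
    print(total, sum(counts.values()))
    for name in CATEGORIES:
        print(f"{name}: {counts[name]} ({percents[name]}%)")
```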

alexdobin commented 1 year ago

Hi Audrey,

Are you trying to map long reads (>300 b)? If so, you would need to use STARlong or, even better, minimap2, which is designed to map long reads.

Cheers,
Alex
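To answer the long-read question for a given file, it can help to look at the sequence lengths directly. The following Python sketch is an illustration only: the helper names `fasta_lengths` and `fraction_longer_than` are hypothetical, the 300-base threshold is taken from the reply above, and the file is assumed to be plain-text FASTA (sequences may span several lines).

```python
import os
import tempfile


def fasta_lengths(path):
    """Return the length of every sequence in a FASTA file, in file order."""
    lengths = []
    current = None  # None until the first header line is seen
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def fraction_longer_than(lengths, threshold=300):
    """Return the fraction of sequences strictly longer than the threshold."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n > threshold) / len(lengths)


if __name__ == "__main__":
    # Made-up demo: one 400-base sequence split over two lines, one 50-base sequence.
    demo = ">long\n" + "A" * 200 + "\n" + "C" * 200 + "\n>short\n" + "G" * 50 + "\n"
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(demo)
    lengths = fasta_lengths(tmp.name)
    print(lengths, fraction_longer_than(lengths))
    os.remove(tmp.name)
```

If a large share of the sequences exceeds the threshold, that supports treating the input as long reads, as suggested in the reply.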