alexdobin / STAR

RNA-seq aligner
MIT License

STARsolo not assigning reads to genes, FASTQs generated on a NovaSeq X #2076

Open muktic opened 4 months ago

muktic commented 4 months ago

I'm running STARsolo on FASTQ files generated on a NovaSeq X sequencer. It isn't producing any filtered counts, which I realized is because the barcodes.tsv and matrix.tsv files are empty. Looking at Summary.csv, I see that STARsolo isn't assigning reads to genes:

Summary.csv
Number of Reads,160006249
Reads With Valid Barcodes,1
Sequencing Saturation,-nan
Q30 Bases in CB+UMI,-nan
Q30 Bases in RNA read,0.905934
Reads Mapped to Genome: Unique+Multiple,0.894126
Reads Mapped to Genome: Unique,0.806095
Reads Mapped to Gene: Unique+Multipe Gene,0
Reads Mapped to Gene: Unique Gene,0
Estimated Number of Cells,11673721214437
Unique Reads in Cells Mapped to Gene,576
Fraction of Unique Reads in Cells,inf
Mean Reads per Cell,992848592
Median Reads per Cell,48
UMIs in Cells,689
Mean UMI per Cell,992363488
Median UMI per Cell,992847376
Mean Gene per Cell,4695918784987350849
Median Gene per Cell,6071197751483185714
Total Gene Detected,20056497640128852

The same script has worked in the past with data from other sequencers. The platform we are using is SeqWell. Does anyone know how to fix this?
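For anyone triaging a similar run, the symptoms in the Summary.csv above can be checked automatically. The following is only a minimal sketch, not part of STAR: the function names `parse_summary` and `sanity_check` and the 0.5 genome-mapping threshold are assumptions made for illustration. The metric names are copied from the file as posted, and each line is assumed to be a `name,value` pair.

```python
import math


def parse_summary(text):
    """Parse Summary.csv text (one 'name,value' pair per line) into a dict of floats."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so the metric name is kept whole.
        name, _, value = line.rpartition(",")
        # float() accepts "nan", "-nan" and "inf" as well as ordinary numbers.
        metrics[name] = float(value)
    return metrics


def sanity_check(metrics, min_genome_fraction=0.5):
    """Return a list of human-readable problems found in the parsed metrics.

    min_genome_fraction is an arbitrary illustrative threshold, not a STAR default.
    """
    problems = []

    # 1. Any nan/inf value means a statistic could not be computed.
    for name, value in metrics.items():
        if not math.isfinite(value):
            problems.append(f"{name} is not finite ({value})")

    # 2. There cannot be more cells than reads.
    reads = metrics.get("Number of Reads")
    cells = metrics.get("Estimated Number of Cells")
    if reads is not None and cells is not None and cells > reads:
        problems.append("Estimated Number of Cells exceeds Number of Reads")

    # 3. Reads map to the genome but none are assigned to a gene.
    genome = metrics.get("Reads Mapped to Genome: Unique", 0.0)
    gene = metrics.get("Reads Mapped to Gene: Unique Gene", 0.0)
    if genome >= min_genome_fraction and gene == 0:
        problems.append("Reads map to the genome but none are assigned to genes")

    return problems


if __name__ == "__main__":
    # A few lines taken from the Summary.csv posted above.
    sample = """Number of Reads,160006249
Sequencing Saturation,-nan
Reads Mapped to Genome: Unique,0.806095
Reads Mapped to Gene: Unique Gene,0
Estimated Number of Cells,11673721214437
Fraction of Unique Reads in Cells,inf
"""
    for problem in sanity_check(parse_summary(sample)):
        print(problem)
```

A check like this only flags that the per-cell statistics are unusable; it does not say why reads fail to be assigned to genes.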

muktic commented 4 months ago

Summary.csv

This is the Summary.csv file.

alexdobin commented 4 months ago

The only thing I can think of is to try a different strandedness with --soloStrand Reverse,