alexdobin / STAR

RNA-seq aligner
MIT License
1.86k stars 506 forks

STAR --runMode soloCellFiltering does not output Summary.csv file #1735

Open dwishsan opened 1 year ago

dwishsan commented 1 year ago

Hello,

With STAR version 2.7.10b, running STAR --runMode soloCellFiltering does not generate a Summary.csv file containing the cell-filtering information that we get when launching a STARsolo alignment with --soloCellFilter set to anything other than None.

Namely, the following information is missing:

- Estimated Number of Cells
- Unique Reads in Cells Mapped to Gene
- Fraction of Unique Reads in Cells
- Mean Reads per Cell
- Median Reads per Cell
- UMIs in Cells
- Mean UMI per Cell
- Median UMI per Cell
- Mean Gene per Cell
- Median Gene per Cell
- Total Gene Detected

Is it possible to output this information when using STAR --runMode soloCellFiltering? I think at least some of it is printed in the Log.out file?

Best, Audric

alexdobin commented 1 year ago

Hi Audric,

The information about reads cannot be recovered from the matrix.mtx files which are used as input for the filtering-only run. These files contain only UMI counts. On the other hand, you can calculate the Gene/UMI statistics after filtering from the resulting matrix.mtx files.

ayeTown commented 1 year ago

Hi @alexdobin, I wanted to ask the same question and found this one. How would we go about doing this exactly? I specifically would like to know how to calculate the "unique reads in cell mapped to gene" and the "fraction of unique reads in cells". But being able to calculate all these summary numbers would be nice. How would I use the matrix.mtx file directly to calculate these? Thanks for your time.

alexdobin commented 1 year ago

Hi Alice,

The matrix.mtx file contains only UMI counts, i.e. "collapsed duplicates", and has no information about reads. You can get the UMI/gene-related quantities from it, such as the median number of UMIs or genes per cell. To get the uncollapsed read-related quantities, you would need to either run STARsolo again or process the BAM file, if one was generated.
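
As a rough illustration of the UMI/gene part, here is a minimal Python sketch that derives those quantities from a filtered matrix.mtx. It is not STAR's own code. It assumes the usual Matrix Market coordinate layout with genes as rows and cell barcodes as columns, one count value per entry, and that every column of the filtered matrix is a called cell. The function name `summarize_mtx` and the dictionary keys are made up for this example, and the results may differ from Summary.csv in rounding or in the exact definition of each statistic.

```python
import statistics
from collections import defaultdict


def summarize_mtx(lines):
    """Compute UMI/gene summary statistics from Matrix Market coordinate lines.

    Assumes rows are genes and columns are cells (an assumption about the
    input layout, not something this function can verify).
    """
    umis_per_cell = defaultdict(float)
    genes_per_cell = defaultdict(int)
    detected_genes = set()
    n_cells = None

    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip the header and comment lines
        fields = line.split()
        if n_cells is None:
            # First data line holds the dimensions: n_genes n_cells n_entries
            n_cells = int(fields[1])
            continue
        gene, cell, count = int(fields[0]), int(fields[1]), float(fields[2])
        if count <= 0:
            continue  # ignore explicit zeros
        umis_per_cell[cell] += count
        genes_per_cell[cell] += 1
        detected_genes.add(gene)

    if not n_cells:
        raise ValueError("no cells found in the matrix")

    # Cells without any entry still count, with zero UMIs and zero genes.
    umis = [umis_per_cell.get(c, 0.0) for c in range(1, n_cells + 1)]
    genes = [genes_per_cell.get(c, 0) for c in range(1, n_cells + 1)]

    return {
        "Estimated Number of Cells": n_cells,
        "UMIs in Cells": sum(umis),
        "Mean UMI per Cell": statistics.mean(umis),
        "Median UMI per Cell": statistics.median(umis),
        "Mean Gene per Cell": statistics.mean(genes),
        "Median Gene per Cell": statistics.median(genes),
        "Total Gene Detected": len(detected_genes),
    }
```

To use it, open the filtered matrix as text and pass the file handle, e.g. `with open(path_to_matrix_mtx) as fh: stats = summarize_mtx(fh)`, where `path_to_matrix_mtx` is a placeholder for wherever the filtering run wrote its output. The read-based rows (Unique Reads in Cells Mapped to Gene, Fraction of Unique Reads in Cells, Mean/Median Reads per Cell) cannot be obtained this way, for the reason given above.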