alexdobin / STAR

RNA-seq aligner
MIT License

Cell calling and filtering in STARsolo with SMART-seq data #1870

Open jwweii opened 1 year ago

jwweii commented 1 year ago

I've been using STARsolo for single-cell RNA-seq analysis, and I have a question about how it handles SMART-seq data. I understand that SMART-seq does not employ UMIs or cell barcodes, which are typically used by STARsolo for cell calling and filtering.

Given this, I'm curious about how STARsolo handles cell calling and filtering when dealing with SMART-seq data. I noticed that STARsolo still produces a raw and a filtered count matrix for SMART-seq data. Could you please elaborate on the criteria and mechanisms it uses for this process in the absence of UMIs and cell barcodes?

Thank you in advance for your help with this matter.

alexdobin commented 1 year ago

Hi @jwweii

The algorithms for SMART-seq are simple and do not involve cell barcodes or UMIs: the reads are already assigned to cells because each cell has its own FASTQ files. If requested, duplicate reads are collapsed when their alignments have the same start and end. STARsolo cell filtering probably should not be used for SMART-seq, as it assumes that "empty" droplets have low read counts.
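
To make the duplicate-collapsing rule concrete, here is a minimal Python sketch of the idea, not STAR's actual implementation. The function name `count_reads`, the tuple layout `(cell, gene, start, end)`, and the choice to group by cell and gene before comparing coordinates are all assumptions made for illustration.

```python
from collections import defaultdict

def count_reads(alignments, dedup=True):
    """Count reads per (cell, gene).

    alignments: iterable of (cell, gene, start, end) tuples (hypothetical layout).
    With dedup=True, reads in the same cell and gene whose alignments share
    both start and end are counted once; with dedup=False every read counts.
    """
    if dedup:
        # Collect the distinct (start, end) pairs seen for each (cell, gene).
        positions = defaultdict(set)
        for cell, gene, start, end in alignments:
            positions[(cell, gene)].add((start, end))
        return {key: len(pairs) for key, pairs in positions.items()}

    # No collapsing: every aligned read contributes one count.
    counts = defaultdict(int)
    for cell, gene, _start, _end in alignments:
        counts[(cell, gene)] += 1
    return dict(counts)


alignments = [
    ("cellA", "g1", 100, 200),
    ("cellA", "g1", 100, 200),  # same start and end as the read above
    ("cellA", "g1", 100, 250),  # same start, different end: kept separately
    ("cellB", "g1", 100, 200),  # different cell: never collapsed with cellA
]
print(count_reads(alignments))               # collapsed counts
print(count_reads(alignments, dedup=False))  # raw read counts
```

Because each cell comes from its own FASTQ files, the cell label here is simply whatever identifier the input carries; no barcode matching is involved.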