alexdobin / STAR

RNA-seq aligner
MIT License

comparison between STARsolo and cellranger #1797

Open DaliBAmor opened 1 year ago

DaliBAmor commented 1 year ago

Good morning, I would like to ask whether we can say that cellranger and STARsolo generated the same output in my case. Thank you.

cellranger count:

Estimated Number of Cells,Mean Reads per Cell,Total Genes Detected,Median UMI Counts per Cell
"8,214","54,413","39,995","10,558"

STARsolo:

Mean Reads per Cell,54200
Estimated Number of Cells,8000
Median UMI per Cell,11001
Total Gene Detected,40047

alexdobin commented 1 year ago

The numbers look quite close to me. You have to use specific parameters for STARsolo to achieve the best agreement: https://github.com/alexdobin/STAR/blob/master/docs/STARsolo.md#how-to-make-starsolo-raw-gene-counts-almost-identical-to-cellrangers
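To put a number on "quite close", the relative difference of each metric can be computed from the two summaries posted above. This is only a minimal sketch: the values are copied from this thread, and the dictionary names and the pairing of "Median UMI Counts per Cell" with "Median UMI per Cell" are assumptions made for illustration.

```python
# Metrics copied from the summaries posted in this thread.
cellranger = {
    "Estimated Number of Cells": 8214,
    "Mean Reads per Cell": 54413,
    "Total Genes Detected": 39995,
    "Median UMI Counts per Cell": 10558,
}
starsolo = {
    "Estimated Number of Cells": 8000,
    "Mean Reads per Cell": 54200,
    "Total Genes Detected": 40047,      # reported as "Total Gene Detected"
    "Median UMI Counts per Cell": 11001,  # reported as "Median UMI per Cell"
}

# Relative difference of STARsolo with respect to cellranger, in percent.
rel_diff = {
    name: 100.0 * (starsolo[name] - cellranger[name]) / cellranger[name]
    for name in cellranger
}

for name, pct in rel_diff.items():
    print(f"{name}: {pct:+.2f}%")
```

With these inputs every metric differs by less than 5%, with the largest gaps in the median UMI count and the estimated number of cells.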

DaliBAmor commented 1 year ago

Thank you, Alex. I think that cellranger eliminates duplicates by default (please correct me if I'm wrong), but with STARsolo, what should I add to my command line if I want to remove PCR duplicates?

Thanks,

alexdobin commented 1 year ago

Hi @DaliBAmor

Both CellRanger and STARsolo (by default) "eliminate" UMI duplicates (i.e. reads that have the same cell barcode and UMI).
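As a rough illustration of what collapsing UMI duplicates means, the sketch below counts each distinct (cell barcode, UMI, gene) combination once, no matter how many reads carry it. It is a simplified toy, not the actual CellRanger or STARsolo implementation: the function name `count_umis` and the example reads are hypothetical, grouping by gene is an assumption added for the example, and only exact matches are collapsed (no correction of sequencing errors in barcodes or UMIs).

```python
from collections import defaultdict


def count_umis(reads):
    """Count unique UMIs per (cell barcode, gene).

    reads: iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads sharing all three values are treated as duplicates of one molecule.
    """
    unique_molecules = set(reads)  # identical tuples collapse to one entry
    counts = defaultdict(int)
    for cell_barcode, _umi, gene in unique_molecules:
        counts[(cell_barcode, gene)] += 1
    return dict(counts)


# Hypothetical reads: the first three share barcode, UMI, and gene.
reads = [
    ("AAAC", "TTGA", "GeneA"),
    ("AAAC", "TTGA", "GeneA"),  # duplicate of the first read
    ("AAAC", "TTGA", "GeneA"),  # duplicate of the first read
    ("AAAC", "CCGT", "GeneA"),  # different UMI, so a different molecule
    ("GGGT", "TTGA", "GeneB"),  # different cell
]

umi_counts = count_umis(reads)
print(umi_counts)
```

Here five reads reduce to three molecules: two for GeneA in cell AAAC and one for GeneB in cell GGGT.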

DaliBAmor commented 1 year ago

Thank you, Alex.

nolarifi commented 1 year ago

Thanks @DaliBAmor for this question. @alexdobin, what about PCR duplication? Do cellranger and STARsolo also eliminate PCR duplicates by default? Thanks in advance.

alexdobin commented 1 year ago

The UMI duplication is caused predominantly by PCR duplication.