pachterlab / kallistobustools

kallisto | bustools workflow for pre-processing single-cell RNA-seq data
https://kallistobus.tools/
MIT License

Why is there a big difference between kallisto and cellranger count result? #50

Open aizhimin opened 1 year ago

aizhimin commented 1 year ago

sample: https://www.ncbi.nlm.nih.gov/sra/SRR8315735

kallisto result: image

cellranger result: image

gaofan83 commented 1 year ago

The data you analyzed has only 25 million reads (people typically sequence about ten times as many). How many "cells" did you recover from each of the two pipelines?

aizhimin commented 1 year ago

@gaofan83 About 160 cells.

But why is there a big difference between kallisto and cellranger?

gaofan83 commented 1 year ago

@aizhimin 160 cells from kallisto? What about cellranger? My understanding is that the dataset you analyzed is a multi-omics dataset with both 5' expression and immune-seq. Are you sure you chose the correct 5' barcode whitelist as the input for kallisto?
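
One rough way to check whether a whitelist fits the reads is to count how many read-1 barcodes match it exactly. The sketch below is only an illustration and is not part of kb or bustools: the function names are made up, and it assumes the cell barcode is the first 16 bases of read 1, as in the 10xv2 layout. A very low hit rate would suggest the wrong whitelist or the wrong read layout.

```python
import gzip
from itertools import islice


def load_whitelist(path):
    """Read one barcode per line into a set (plain text or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def whitelist_hit_rate(fastq_path, whitelist, barcode_len=16, max_reads=100_000):
    """Fraction of the first max_reads reads whose leading bases match the whitelist exactly.

    Assumption: the barcode occupies the first barcode_len bases of each read.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    hits = total = 0
    with opener(fastq_path, "rt") as handle:
        # A FASTQ record is 4 lines; the sequence is the 2nd line of each record.
        for i, line in enumerate(islice(handle, 4 * max_reads)):
            if i % 4 == 1:
                total += 1
                hits += line.strip()[:barcode_len] in whitelist
    return hits / total if total else 0.0
```

For example, `whitelist_hit_rate("SRR8315735_1.fastq.gz", load_whitelist("whitelist.txt"))` would give the exact-match fraction over the first 100,000 reads, where `whitelist.txt` is a placeholder for whichever whitelist file you are testing. This counts exact matches only, so it will read somewhat lower than a pipeline that also corrects barcodes with one mismatch.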

aizhimin commented 1 year ago

@gaofan83 This is my command: kb count -i /mnt/data/ref/transcriptome.idx -g /mnt/data/ref/transcripts_to_genes.txt -t 8 -m 16G --mm --h5ad --filter bustools --cellranger --gene-names --verbose --overwrite -x 10XV2 -o output/SRR8315735 SRR8315735_1.fastq.gz SRR8315735_2.fastq.gz

Where can I find the correct 5' barcode whitelist?

gaofan83 commented 1 year ago

I am not familiar with kb. If you use kallisto/bustools directly, you will see there is an option for the whitelist. You can search the 10x site to see whether the 3' barcode whitelist (which people typically refer to as 10xv2 or 10xv3) is the same as the 5' barcode whitelist. https://github.com/BUStools/bustools
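
If you have both whitelist files on disk, a quick comparison settles whether they are the same. This is a minimal sketch with hypothetical names (`compare_whitelists` and the file paths in the note below are placeholders, not anything shipped by 10x, kb or bustools), and it assumes each file has one barcode per line.

```python
import gzip


def load_whitelist(path):
    """Read one barcode per line into a set (plain text or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_whitelists(path_a, path_b):
    """Summarize the overlap between two barcode whitelist files."""
    a = load_whitelist(path_a)
    b = load_whitelist(path_b)
    return {
        "only_a": len(a - b),    # barcodes present only in the first file
        "only_b": len(b - a),    # barcodes present only in the second file
        "shared": len(a & b),    # barcodes present in both files
        "identical": a == b,     # True when the two sets match exactly
    }
```

Calling `compare_whitelists("whitelist_3prime.txt", "whitelist_5prime.txt")` with your own paths returns the counts. If `identical` is `True`, the choice between the two files cannot explain the difference between the pipelines.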

aizhimin commented 1 year ago

@gaofan83 The 10xv2 whitelist includes the 5' barcode whitelist. See this image from the 10x site.

image