JiekaiLab / scTE

MIT License

Only detecting 3 cells expressing at least 200 genes #71

Open t1nnenissen opened 1 year ago

t1nnenissen commented 1 year ago

Hi there,

I ran the following on my unfiltered BAM file from a multiplexed 10X experiment:

(base) k20047529@KCLFVFGP6K8Q05N scTE % scTE -i unassigned_alignments_proliferation.bam -o out -x hg38.exclusive.idx --hdf5 False -CB CR -UMI UR
DEBUG : Creating converter from 7 to 5
DEBUG : Creating converter from 5 to 7
DEBUG : Creating converter from 7 to 5
DEBUG : Creating converter from 5 to 7
INFO : Parameter list:
Sample = out
Reference annotation index = hg38.exclusive.idx
Minimum number of genes required = 200
Minimum number of counts required = None
Number of threads = 1

INFO : Loading the genome annotation index... 2023-08-11 00:16:00
INFO : Loaded 'hg38.exclusive.idx' binary file with 5400305 items
['1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'M', 'X', 'Y']
INFO : Finished loading the genome annotation index... 2023-08-11 00:16:53

INFO : Processing BAM/SAM files ...2023-08-11 00:16:53
INFO : Input SAM/BAM file appears to be valid
CR UR good

INFO : Done BAM/SAM files processing ...2023-08-11 00:17:54

INFO : Splitting ...2023-08-11 00:17:54
INFO : Executing single thread path
INFO : Finished processing sample files 2023-08-11 00:18:13

INFO : Fetching from the annotation index... 2023-08-11 00:18:13
INFO : Done fetching... 2023-08-11 00:18:14

INFO : Calculating expression... 2023-08-11 00:18:14
INFO : Detect 3 cells expressed at least 200 genes, results output to out.csv
INFO : Finished calculating expression 2023-08-11 00:18:14
INFO : Done with 0d 0h 2m 13s

However, when I look at nFeature_RNA in Seurat after demultiplexing, I can see many cells that express more than 200 genes. I have no idea what's going on. Any ideas as to why the program only detects 3 cells with at least 200 genes when this is not the case?

jphe commented 1 year ago

Dose the unassigned_alignments_proliferation.bam outputted by cellranger pipeline, how many cells were reported in that pipeline.

Otherwise, you need to check the ratio of reads in the unassigned_alignments_proliferation.bam file that have the CR:Z and UR:Z tags.
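
One way to run that check is sketched below, assuming samtools is installed and the script is saved under the hypothetical name check_tags.py. It is not part of scTE. It reads SAM text (for example from samtools view) and counts how many alignment records carry the CR and UR tags. The command above passes -CB CR -UMI UR, so reads without those tags presumably cannot be assigned to a cell or UMI.

```python
import sys
from collections import Counter


def tag_coverage(sam_lines, tags=("CR", "UR")):
    """Count SAM records that carry each tag, and records carrying all of them."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
        present = {f[:2] for f in fields[11:] if len(f) >= 5 and f[2] == ":"}
        counts["total"] += 1
        for tag in tags:
            if tag in present:
                counts[tag] += 1
        if all(tag in present for tag in tags):
            counts["all"] += 1
    return counts


# Hypothetical usage (check_tags.py is an assumed file name):
#   samtools view unassigned_alignments_proliferation.bam | python check_tags.py -
if __name__ == "__main__" and len(sys.argv) > 1:
    source = sys.stdin if sys.argv[1] == "-" else open(sys.argv[1])
    result = tag_coverage(source)
    total = result["total"]
    if total == 0:
        print("no alignment records found")
    else:
        for key in ("CR", "UR", "all"):
            print(f"{key}: {result[key]} of {total} reads ({result[key] / total:.1%})")
```

If only a small fraction of reads carries both tags, most of the data never reaches the per-cell counting step, which could explain a very low cell count.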

t1nnenissen commented 1 year ago

Hi there,

Thanks a lot for the answer. What does "Dose the file" mean? I have reached out to the sequencing facility that ran the cellranger pipeline to ask how many cells were reported.

Could you explain why the ratio of reads that have the CR:Z and UR:Z tags is important?