Closed AnjaliC4 closed 2 years ago
Yes, cellsnp-lite requires bam/sam/cram file as input. For 10x non-UMI single-cell data, cellsnp-lite would filter reads of type UNMAP, SECONDARY, QCFAIL, or DUP by default if --UMItag None
is set. The reads not mapped in proper pairs would also be filtered by default if --countORPHAN
is not used (i.e. probably no need to include --inclFLAG 2
).
Seems there would be little improvements by changing --exclFLAG
or --inclFLAG
if the reads were well-flagged by upstream cellranger. But you may want to try replacing -R
with -T
if a SNP list was given and -R
was used, as sometimes -T
could be faster.
Ok thank you very much for your suggestions. I read in the manual that -T Similar as -R, but the next position is accessed by streaming rather than indexing/jumping (like -T in samtools/bcftools mpileup). So can I assume that the results will be equivalent whether I use -R or -T? Another option I thought was to use MAPQ>30 rather than >20 - but in your experience, does setting it 20 result in more cells being demultiplexed? And finally could you please suggest how can I check if these --exclFLAG flags of cellSNP (UNMAP, SECONDARY, QCFAIL, or DUP) are indeed present in my 10x bam file? Thank you.
Hi, the -R
and -T
should give the same result.
Using MAPQ>30 or >20 is expected to give little difference for the result of demultiplexing.
samtools flagstat could be used for an overview of the number of alignments for each FLAG type. e.g.,
samtools flagstat <in.bam>
or use samtools view -f/-F/-G for more detailed results. e.g.,
samtools view -f 1024 <in.bam> | less
would print all reads of DUP type (duplicates).
Hi, I have very large10x cellranger bams (about 60-70gb) produced using their BWA-MEM default parameters. 10x default pipeline produces bam files that include the unmapped, chimeric, and duplicate regions. However, the fragment files they produce have the deduplicted and paired fragments but Cellsnp requires a bam file. The problem is that it is taking me more than two days to run 70g bam on cellsnp-lite. Could you please recommend how could I fasten it and which flags I should include or exclude from the 10x produced bam file. My understanding is that --exclFLAG parameter has the following set to default: UNMAP,SECONDARY,QCFAIL,DUP. Should I also include --inclFLAG 2 (2 for reads mapped in proper pairs)? Please note I have non UMI single-cell data. Please let me know your thoughts. Thank you very much!