Closed alanfoleynibrt closed 2 years ago
Hi, --UMItag None
means cellsnp-lite will not use UMI information and will parse each read independently, it can still pileup a list of variants like in the case of 10x scDNA data. As you have already obtained the variant list with --UMItag MR
, you may filter part of them with bcftools
or by rerunning cellsnp-lite with proper --minCOUNT
and --minMAF
(default values are 20 and 0, respectively) if you want.
Thank you for your reply
I'm having a strange issue now.
I run this:
"/home/alan/AbSeq/SNV/MQuad/cellsnplite/cellsnp-lite/cellsnp-lite" \ -s "/home/alan/AbSeq/yujuanKX/sevenbridgesoutput/gatkbams/45712HENGLI.bam" \ -O "/home/alan/AbSeq/yujuanKX/sevenbridgesoutput/gatkbams/mquadtest/" \ -b "/home/alan/AbSeq/SNV/MQuad/kxcellsnplite/barcodeskx.tsv" \ -p 20 \ --cellTAG CB \ --UMItag None \ --chrom KX576660.1 \ --minMAF 0.02 \ --minMAPQ 5
and I have an output of 1419 SNPs which are missing certain variants. For example, I know from viewing my 45712HENGLI.bam in IGV there is a variant at 2235 C>T of allele depth 18 and sequencing depth 18.
However, when I view the base.vcf file I see:
KX576660.1 2219 . C A . PASS AD=1;DP=20;OTH=0 KX576660.1 2221 . T A . PASS AD=1;DP=19;OTH=1 KX576660.1 2230 . G T . PASS AD=2;DP=20;OTH=0 KX576660.1 2231 . A T . PASS AD=1;DP=20;OTH=0 KX576660.1 2412 . T A . PASS AD=2;DP=20;OTH=0 KX576660.1 2414 . G T . PASS AD=1;DP=22;OTH=0
The reference genome is correct, and the variant calls at the other positions are correct.
But a major variant at 2235 is missing.
This is strange because when I run mpileup directly:
samtools mpileup \ "/home/alan/AbSeq/yujuanKX/sevenbridgesoutput/gatkbams/45712HENGLI.bam" \ -f "/home/alan/AbSeq/SNV/mpileup/KX576660.fasta" \ -o /home/alan/AbSeq/yujuanKX/sevenbridgesoutput/gatkbams/mquadtest/mpileupoutput
I see that the variant at 2235 is present:
KX576660.1 2233 t 19 ..$................. F;JIFFJFFJ:FJJFFFF: KX576660.1 2234 t 17 ...-1C.............. FJHFFJFFJFFJJFFFF KX576660.1 2235 c 19 TT*TTTTTTTTTTTTTTTT FJJFFJFFFJFFJJFFFFF KX576660.1 2236 g 19 ................... FJJFFJFF:JFFJJFF:FF KX576660.1 2237 g 18 .................. FJJFFJ:FJFFJJFFFFF
Can you advise on what is happening here?
Alan
The "missing" variants were probably filtered by --minCOUNT
or --minMAF
(default values are 20 and 0, respectively). The variants whose "DP+OTH < minCOUNT" or "min(AD, DP-AD) < DP * minMAF" would be filtered. In the case of pos 2235, it was expected that the variant would be filtered (depth 19 is smaller than minCOUNT 20; minor allele fraction 0 is smaller than minMAF 0.02 as you specified). You may try using looser filtering parameters to "call" more variants.
Xianjie
Many thanks Xianjie!
This did fix it
Hi I am trying to use cellsnp-lite to analyse scRNAseq data from the BD Rhapsody.
I put my fastqs though the BD Rhapsody™ WTA Analysis Pipeline Revision 6.
It outputted a BAM file with sorting based on Cell barcode (I have 5500 cells). I split the BAM file into just "MT" chromosome reads.
When I try to analyse this MT BAM file with cellsnp-lite I run this:
"/home/alan/AbSeq/SNV/MQuad/cellsnplite/cellsnp-lite/cellsnp-lite" \ -s "/home/alan/AbSeq/seven_bridges_output/yujuanjustmt.bam" \ -O /home/alan/AbSeq/SNV/MQuad/cellsnplite/ \ -b "/home/alan/AbSeq/seven_bridges_output/barcodes2.tsv" \ -p 20 \ --cellTAG CB \ --chrom MT
it runs with:
[I::main] start time: 2022-05-19 15:00:01 [W::check_args] Max depth set to maximum value (2147483647) [I::main] mode 2a: pileup 1 whole chromosomes in 5581 single cells. [W::csp_pileup_core] Combined max depth is above 1M. Potential memory hog! [I::csp_pileup_core][Thread-0] processing chrom MT ...
but the output is an empty cellSNP.base.vcf file.
The BAM file should have cell barcodes (which I have provided) and UMIs.
Is there something I'm doing wrong?
When I run with "--UMItag None" I get a cellSNP.base.vcf file with a list of variants at every single nucleotide position - which must be incorrect.
When I run with "--UMItag MR" which I believe is the UMItag for my BAM file, again I get a huge number of variants at all positions.
Many thanks
Alan