Closed YifanWei1115 closed 2 months ago
Mmm, the summary stats give the number of SNV and indel rows (records), but the per-sample numbers count actual genotypes. For example, a sample can have a homozygous reference genotype 0/0 in a SNV record; such a record is counted in the summary but not as a SNP for that sample.
If this does not resolve the problem, please attach the full stats file.
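To illustrate the distinction between record counts and per-sample genotype counts, here is a minimal Python sketch on a made-up, single-sample toy VCF body. It is not how bcftools computes its numbers; the function name `count_records_and_genotypes`, the toy data, and the simplified SNV/indel classification (biallelic sites only, decided by REF/ALT length) are all assumptions made for illustration.

```python
# Hypothetical toy data (tab-separated VCF body, one sample):
# CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
TOY_VCF = """\
chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1
chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/0
chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT\t1/1
chr1\t400\t.\tT\tTA\t50\tPASS\t.\tGT\t0/0
chr1\t500\t.\tAC\tA\t50\tPASS\t.\tGT\t0/1
"""


def count_records_and_genotypes(vcf_text):
    """Count SNV/indel records, and how many of them the sample actually carries.

    Simplification (assumption): biallelic sites only; a record is an SNV
    when REF and ALT are both one base long, otherwise an indel.
    """
    counts = {
        "snv_records": 0,
        "indel_records": 0,
        "sample_snvs": 0,
        "sample_indels": 0,
    }
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        ref, alt = fields[3], fields[4]
        genotype = fields[9].split(":")[0]  # GT is the first FORMAT field here

        kind = "snv" if len(ref) == 1 and len(alt) == 1 else "indel"

        # Record-level count: every row counts, whatever the genotype is.
        counts[f"{kind}_records"] += 1

        # Genotype-level count: only rows where the sample carries a
        # non-reference allele (0/0 and missing genotypes are skipped).
        alleles = genotype.replace("|", "/").split("/")
        if any(allele not in ("0", ".") for allele in alleles):
            counts[f"sample_{kind}s"] += 1
    return counts


if __name__ == "__main__":
    print(count_records_and_genotypes(TOY_VCF))
```

On this toy input the record-level counts are 3 SNVs and 2 indels, while the sample carries only 2 SNVs and 1 indel, because two of the rows have a 0/0 genotype.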
Hi there, I've encountered some results that confuse me, and I could not find any documentation on how to interpret the output of stats. I ran the command:
bcftools stats HG001_ZJ1.sorted.markdup.BQSR.vcf -s - > HG001_ZJ1.sorted.markdup.BQSR.stats
The output .stats file looks as follows (only part of it is listed):
Then I ran the command:
plot-vcfstats HG001_ZJ1.sorted.markdup.BQSR.stats -p visual_stats
The number of SNPs for this sample in the output file (summary.pdf) was 3421056, which differs from the one I get from the .stats file (number of SNPs: 6552678).
The source data used for plotting and the plot are as follows:
The same happens with the number of indels (910060 from the plot vs. 2536999 from the .stats file):
Could you tell me what the difference is between these two numbers of SNPs/indels and why they differ?
By the way, I would appreciate it if you could add a simple interpretation of the plot-vcfstats outputs to the documentation or elsewhere.
Thanks!