pwwang / vcfstats

Powerful statistics for VCF files
https://pwwang.github.io/vcfstats/
MIT License
65 stars 15 forks source link

VCF Mean Depth Plot Is In Reverse #30

Closed gk7279 closed 2 years ago

gk7279 commented 2 years ago

Hi-

My VCF is like this

##fileformat=VCFv4.1
##source=VarScan2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Average per-sample depth of bases with Phred score >= 15">
##INFO=<ID=WT,Number=1,Type=Integer,Description="Number of samples called reference (wild-type)">
##INFO=<ID=HET,Number=1,Type=Integer,Description="Number of samples called heterozygous-variant">
##INFO=<ID=HOM,Number=1,Type=Integer,Description="Number of samples called homozygous-variant">
##INFO=<ID=NC,Number=1,Type=Integer,Description="Number of samples not called">
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
##FILTER=<ID=indelError,Description="Likely artifact due to indel reads at this position">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=SDP,Number=1,Type=Integer,Description="Raw Read Depth as reported by SAMtools">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Quality Read Depth of bases with Phred score >= 15">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)">
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency">
##FORMAT=<ID=PVAL,Number=1,Type=String,Description="P-value from Fisher's Exact Test">
##FORMAT=<ID=RBQ,Number=1,Type=Integer,Description="Average quality of reference-supporting bases (qual1)">
##FORMAT=<ID=ABQ,Number=1,Type=Integer,Description="Average quality of variant-supporting bases (qual2)">
##FORMAT=<ID=RDF,Number=1,Type=Integer,Description="Depth of reference-supporting bases on forward strand (reads1plus)">
##FORMAT=<ID=RDR,Number=1,Type=Integer,Description="Depth of reference-supporting bases on reverse strand (reads1minus)">
##FORMAT=<ID=ADF,Number=1,Type=Integer,Description="Depth of variant-supporting bases on forward strand (reads2plus)">
##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1
1   1669766 .   A   .   .   PASS    DP=1;WT=0;HET=0;HOM=0;NC=1  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:1
1   1669767 .   A   .   .   PASS    DP=1;WT=0;HET=0;HOM=0;NC=1  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:1
1   1669768 .   T   .   .   PASS    DP=5;WT=0;HET=0;HOM=0;NC=1  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:5
1   1669769 .   G   .   .   PASS    DP=7;WT=0;HET=0;HOM=0;NC=1  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:7
1   1669770 .   T   .   .   PASS    DP=8;WT=1;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:14:8:8:8:0:0%:1E0:34:0:8:0:0:0
1   1669771 .   T   .   .   PASS    DP=9;WT=1;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:18:9:9:9:0:0%:1E0:33:0:9:0:0:0
1   1669772 .   T   .   .   PASS    DP=9;WT=1;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:18:9:9:9:0:0%:1E0:34:0:9:0:0:0
1   1669773 .   G   .   .   PASS    DP=16;WT=1;HET=0;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:29:16:16:16:0:0%:1E0:34:0:16:0:0:0
1   1669774 .   T   .   .   PASS    DP=16;WT=1;HET=0;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:29:16:16:16:0:0%:1E0:33:0:16:0:0:0
1   1669775 .   C   .   .   PASS    DP=21;WT=1;HET=0;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:40:21:21:21:0:0%:1E0:33:0:21:0:0:0
1   1669776 .   A   .   .   PASS    DP=22;WT=1;HET=0;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:40:22:22:22:0:0%:1E0:34:0:22:0:0

When I use this command

vcfstats --vcf /data/test.vcf --outdir /data/examples/ --formula 'MEAN(DEPTHs{0}) ~ CONTIG[1,2,3,4,5]' --title 'MeanDepth' --config examples/config.toml

My mean depth plots are in the negative scale to the bottom.

Can you please help?

pwwang commented 2 years ago

Mind attaching the plot?

gk7279 commented 2 years ago

meandepth col

Thanks a lot for the very quick response. The VCF file's first few lines were only included. Thanks for any timely pointers.

pwwang commented 2 years ago

The problem is that the information for your first 4 variants is truncated. You don't have a DP value for those variants.

GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:1
gk7279 commented 2 years ago

Checking and shall update you in a moment. Thanks.

gk7279 commented 2 years ago

It worked. Thanks a lot.

If I want to do the mean_depth for a full run (with bunch of samples), is there a way this tool can do that?

meandepth col

pwwang commented 2 years ago

Currently, you can only run them one by one.

gk7279 commented 2 years ago

Thanks. Closing it for now. Appreciate the prompt action