Closed prasundutta87 closed 3 years ago
Hi, this is not possible, but I just added this functionality into the fill-tags plugin. For example:
bcftools +fill-tags test/fill-tags-VAF.vcf -- -t VAF
Thanks @pd3 ..do I need to update bcftools for this to work?
Yes, you'll need the latest version from github, please see here http://samtools.github.io/bcftools/howtos/install.html
Okay.. I actually installed bcftools using bioconda..and I am using bcftools v1.11. Is there any way if I only want to update the plugins using bioconda? I actually have a lot of permission issues as I am working on a cluster, so installing/updating anything can be an issue for me.
You could take just the plugin source code and compile. Or wait for the next numbered version, a new release is imminent.
Thanks a lot for this @pd3! :-)
So this works but why if you are already creating the total AD in the denominator for this calculation, there is not total AD tag to fill?
I don't understand the question. What do you mean by 'total AD'? Perhaps the total depth? If yes, then you would want to add a DP tag. As why it is not there already, well, because someone would have to take the time and do it, and time is a valuable commodity.
The denominator for calculating the VAF is the sum of the comma separated FMT/AD tag. You already calculate adding these together when filling the VAF tag. So it would be nice to have either an INFO/DP tag or a FMT/DP tag to fill with the sum of this comma separated AD tag (the denominator for the VAF). Pindel only gives the AD tag in the format field, i.e. 45,12
. But they don’t add this up for a total DP tag. They have absolutely no DP tag whatsoever. This would be nice to have especially because you already are doing it in the denominator of the VAF.
I can post an example of Pindel output when I get to office.
This is possible via the command
bcftools +fill-tags input.vcf -- -t 'DP=sum(AD)'
For me this only gives the reference
depth total.
input:
GT:AD 0/0:22,2
output goes to INFO/DP:
DP=22 GT:AD 0/0:22,2
correct output should be:
DP=24 GT:AD 0/0:22,2
Here is an example input where the DP should equal 112(111+1) and 87(83+84):
It should be something like bcftools filter
where you can do AD[sample:subfield] such as DP=sum(AD[0:0], AD[0:1])
or however you are calulating the total denominator for VAF.
##fileDate=20161216
##source=pindel
##reference=GRCh38DH
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=HOMLEN,Number=1,Type=Integer,Description="Length of base pair identical micro-homology at event breakpoints">
##INFO=<ID=PF,Number=1,Type=Integer,Description="The number of samples carry the variant">
##INFO=<ID=HOMSEQ,Number=.,Type=String,Description="Sequence of base pair identical micro-homology at event breakpoints">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=NTLEN,Number=.,Type=Integer,Description="Number of bases inserted in place of deleted code">
##FORMAT=<ID=PL,Number=3,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Reference depth, how many reads support the reference">
##FORMAT=<ID=AD,Number=2,Type=Integer,Description="Allele depth, how many reads support this allele">
##bcftools_viewVersion=1.12-60-g7100c81+htslib-1.12-50-g0313654
##bcftools_viewCommand=view 1076180_23153_0_0/1076180_23153_0_0.vcf.gz; Date=Mon Jun 14 12:59:19 2021
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT UKB_1143822_0230759086
chr1 69246 . GTACTTCCTGCTAGCCAACCTCTCACTCATTGATCTGTCTCTGTCTTCAGTCACAGCCCCCAAGATGAT G . PASS END=69314;HOMLEN=4;HOMSEQ=TACT;SVLEN=-68;SVTYPE=DEL GT:AD 0/0:111,1
chr1 451649 . A AG . PASS END=451649;HOMLEN=1;HOMSEQ=G;SVLEN=1;SVTYPE=INS GT:AD 0/0:83,4
Ah, you are right. This has now been fixed and extended to allow custom number of fields to output (fixed number or variable number of fields) and the output type (float or integer). For example, try:
# generates Number=.,Type=Float
bcftools +fill-tags input.vcf -- -t 'DP=sum(FORMAT/AD)'
# generates Numbe=1,Type=Integer
bcftools +fill-tags input.vcf -- -t 'DP:1=int(sum(FORMAT/AD))'
I hope this helps.
Cool thanks for this. Looks like you updated a lot so I appreciate it.
On Wed, Jun 16, 2021 at 4:18 AM Petr Danecek @.***> wrote:
Ah, you are right. This has now been fixed and extended to allow custom number of fields to output (fixed number or variable number of fields) and the output type (float or integer). For example, try:
generates Number=.,Type=Float
bcftools +fill-tags input.vcf -- -t 'DP=sum(FORMAT/AD)'
generates Numbe=1,Type=Integer
bcftools +fill-tags input.vcf -- -t 'DP:1=int(sum(FORMAT/AD))'
I hope this helps.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/samtools/bcftools/issues/1422#issuecomment-862197172, or unsubscribe https://github.com/notifications/unsubscribe-auth/ASU5KNNOHY673Z3OJMJ5P5LTTBT57ANCNFSM4X4IFDDQ .
Thanks Petr, this functionality was just super useful to me.
Hi,
Can I please be helped with a way on how to solve this query? I have to add an additional tag for variant allele fraction (VAF) in my VCF INFO area or (probably FORMAT also) in my single sample VCF file.
In my FORMAT/AD, I have two values, for ex., 43,45 where the numbers represent Allelic depths for the ref and alt alleles for a sample in the order listed.
How can I utilise the two values and calculate VAF using the formula alt/ref+alt and also add it to my VCF INFO or FORMAT area?
I have tried bcftools query -f '[%AD\n]' which gives 43,45. But, I don't know how to separate them in bcftools and use it to do the VAF calculation and add it in the VCF file.
Regards, Prasun
I am using bcftools v1.11.