samtools / bcftools

This is the official development repository for BCFtools. Installation instructions and other documentation are available at http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

Updating DP tag in the INFO field following subsetting (or merging) of samples #2175

Closed by evcurran 2 months ago

evcurran commented 2 months ago

I used bcftools to create two subsetted VCFs from a larger VCF. The AN field automatically updates, but the DP (INFO) field does not.

To illustrate, here is a variant from the original VCF (n=708):

scaffold_1  377 .   C   .   26.89   PASS    AN=2142;DP=7288;InbreedingCoeff=-0.0087;set=ReferenceInAll

Here is the same variant from subset 1 (n=239):

scaffold_1  377 .   C   .   26.89   PASS    AN=792;DP=7288;InbreedingCoeff=-0.0087;set=ReferenceInAll

And the variant in subset 2 (n=153):

scaffold_1  377 .   C   .   26.89   PASS    AN=484;DP=7288;InbreedingCoeff=-0.0087;set=ReferenceInAll

So I tried to use bcftools +fill-tags to update INFO/DP following the advice here: https://samtools.github.io/bcftools/howtos/plugin.fill-tags.html

bcftools +fill-tags subset.vcf.gz -Oz -o subset.DP.vcf.gz  -- -t 'DP=sum(FORMAT/DP)'

And I get the following error:

Error: the field FORMAT/FORMAT/DP is not present

Any pointers as to where I'm going wrong? Thanks!

pd3 commented 2 months ago

As for the first question, bcftools view will not update INFO/DP based on the FORMAT/DP values of the subset samples.

For the second question, the command looks correct and should work. The error message suggests that you are running an old version of bcftools. Please try the latest release; we are at 1.20 now, and I am confident it will work there.
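
As an independent cross-check on what DP=sum(FORMAT/DP) should produce for a record, the per-sample depths can be summed directly from the VCF text. The following is only a minimal sketch, not part of bcftools: the function name sum_format_dp is hypothetical, and it assumes uncompressed, tab-delimited VCF on stdin with DP listed in the FORMAT column. Missing values (".") are skipped.

```shell
# Hypothetical helper (not a bcftools command): print CHROM, POS and the
# sum of per-sample FORMAT/DP for each record of a VCF read from stdin.
sum_format_dp() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      n = split($9, keys, ":")         # FORMAT column lists the per-sample keys
      idx = 0
      for (i = 1; i <= n; i++) if (keys[i] == "DP") idx = i
      if (idx == 0) next               # record has no FORMAT/DP
      total = 0
      for (s = 10; s <= NF; s++) {     # sample columns start at field 10
        split($s, vals, ":")
        # skip missing (".") or truncated per-sample values
        if (vals[idx] != "." && vals[idx] != "") total += vals[idx]
      }
      printf "%s\t%s\t%d\n", $1, $2, total
    }'
}
```

For example, `bcftools view subset.DP.vcf.gz | sum_format_dp | head` prints values that can be compared against the INFO/DP written by +fill-tags for the same positions.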