samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools merge gvcf unexpected output #1940

Open juliagrp opened 1 year ago

juliagrp commented 1 year ago

Hi! I'm running bcftools mpileup and call to obtain a gVCF for each sample. I call the samples separately because of computing time. As a reduced example:

bcftools mpileup --threads 8 -a AD,DP,SP -Ou -f $reference_genome $name"_sorted_filtered.bam" | bcftools call -g 1 -f GQ,GP -m -Ob -o calls.bcf
bcftools view -Oz calls.bcf -o calls.gvcf.gz
bcftools norm -Oz -f $reference_genome -d all calls.gvcf.gz -o $name"_calls_norm.gvcf.gz"
tabix -p vcf $name"_calls_norm.gvcf.gz"

Now I want to merge all the gVCF samples. To that end, I first tested with 3 samples, each one in its own gVCF file, using this command:

bcftools merge --file-list mini_sample_list.txt -g $reference_genome --force-samples > mini_missing_merge.gvcf

The first lines of the merged gVCF file are not what I expected. In the screenshot, the merged gVCF is at the top left, sample N20 at the top right, N130 at the bottom left, and N142 at the bottom right.

[Screenshot: MicrosoftTeams-image]

In the first line (pos 1001), everything is as expected. But in line 2 (pos 1003), I expected:

CHROM  POS   N20  N130  N142
chr1   1003  ./.  ./.   0/0

And the result given by merge is:

CHROM  POS   N20  N130  N142
chr1   1003  0/0  0/0   0/0

Something similar happened in the 3rd line. Judging from the individual gVCF files, the merged gVCF should look like this, or the position should not appear as a SNP at all, because none of the samples has a value for it:

CHROM  POS   N20  N130  N142
chr1   1024  ./.  ./.   ./.

And the result given by merge is:

CHROM  POS   N20  N130  N142
chr1   1003  0/0  ./.   0/0

I cannot understand how this works...

What should I change in my commands to obtain the expected results?
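For comparing the merged file with the per-sample files as text rather than in a screenshot, a minimal sketch along these lines may help. It prints one row per record with the genotype of every sample. The function name `show_gt`, the `BCFTOOLS` override variable, and the window `chr1:1001-1024` are assumptions made for illustration; the window is simply taken from the positions mentioned above.

```shell
#!/bin/sh
# Sketch: print CHROM, POS, END and the per-sample GT for a small window.
BCFTOOLS=${BCFTOOLS:-bcftools}      # hypothetical override, e.g. for a dry run
WINDOW=${WINDOW:-chr1:1001-1024}    # assumed window covering the rows in question

# show_gt FILE: one row per record in WINDOW.
# -H adds a header row naming the samples; -t streams the file, so the
# uncompressed merged output does not need an index.
show_gt() {
    "$BCFTOOLS" query -H -t "$WINDOW" -f '%CHROM\t%POS\t%END[\t%GT]\n' "$1"
}

# Only runs when the merged file from the command above is present.
if [ -f mini_missing_merge.gvcf ]; then
    show_gt mini_missing_merge.gvcf
fi
```

Running the same function on each per-sample gVCF shows the END of every record, which matters here because a gVCF reference block covers every position from POS to END, not only POS itself.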

pd3 commented 1 year ago

It would be better to attach a small test case (that is, VCFs) instead of screenshots, as screenshots do not allow us to run tests and debug the problem properly.
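A small test case could be cut out of the per-sample files roughly as follows. This is only a sketch: the function name `subset_gvcf`, the `BCFTOOLS` override variable, the `test_` output prefix, and the region `chr1:1000-1030` are all assumptions, and the list file is assumed to hold one gVCF path per line, as used with `--file-list`.

```shell
#!/bin/sh
# Sketch: restrict each per-sample gVCF to a small region for attaching.
BCFTOOLS=${BCFTOOLS:-bcftools}      # hypothetical override, e.g. for a dry run
REGION=${REGION:-chr1:1000-1030}    # assumed region around the reported rows

# subset_gvcf IN: write the records of IN that fall in REGION to
# test_<basename of IN>. -r needs the tabix index of the input file.
subset_gvcf() {
    dest="test_$(basename "$1")"
    "$BCFTOOLS" view -r "$REGION" -Oz -o "$dest" "$1"
}

# Only runs when the sample list from the merge command is present.
if [ -f mini_sample_list.txt ]; then
    while IFS= read -r gvcf; do
        subset_gvcf "$gvcf"
    done < mini_sample_list.txt
fi
```

The resulting files are a few records each, small enough to attach together with the exact merge command.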

Nonetheless, the headers show that an old version of bcftools, 1.15, was used. Can you please repeat with the latest GitHub version? There have been important fixes which likely also solve your problem.

http://samtools.github.io/bcftools/howtos/install.html
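To confirm which version is installed and which version wrote a given file, something like the sketch below could be used. The helper `version_lt` is a hypothetical name and relies on `sort -V`, which is an assumption about the local `sort`; the file name is the merged output from the original post, and 1.15 is the version mentioned above.

```shell
#!/bin/sh
# version_lt A B: succeed if version string A sorts strictly before B.
# Hypothetical helper; needs a sort that supports -V (version sort).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if command -v bcftools >/dev/null 2>&1; then
    # The first line of the output looks like "bcftools <version>".
    installed=$(bcftools --version | head -n 1 | awk '{print $2}')
    echo "installed bcftools: $installed"
    if ! version_lt 1.15 "$installed"; then
        echo "not newer than 1.15, consider updating"
    fi
    # Header lines starting with ##bcftools record the version and command
    # line of each bcftools step that wrote the file.
    if [ -f mini_missing_merge.gvcf ]; then
        bcftools view -h mini_missing_merge.gvcf | grep '^##bcftools'
    fi
fi
```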