samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here: http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools view -S data size vcf #2170

Closed hitwbt closed 4 months ago

hitwbt commented 5 months ago

Hi, I used the command bcftools view -S 1.txt FAM596.vcf.gz -Oz > NA19919.vcf.gz to extract sample NA19919 from a three-sample VCF. Why is the resulting single-sample VCF (1.16G) bigger than the original three-sample VCF (1.13G)? Shouldn't it be about one third the size of FAM596.vcf.gz?

pd3 commented 5 months ago

Possibly. It depends on how big the mandatory columns (CHROM through INFO) are compared to the FORMAT fields: subsetting samples only removes per-sample data, so if the mandatory columns dominate, the output will not shrink much. Why don't you look at the output file and compare it with the input file? It also matters whether you are comparing uncompressed or compressed files: compression can shrink the size difference when the data is easily compressible, i.e. has low information entropy.
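
One way to make that comparison concrete is to measure how many bytes the mandatory columns and the FORMAT/sample columns each contribute, both raw and compressed. The following is a minimal sketch, not part of bcftools: the function name column_sizes and the synthetic records are made up for illustration, and it uses plain gzip via zlib rather than the BGZF blocks that -Oz writes, so the compressed figures are only approximate.

```python
import zlib


def column_sizes(lines):
    """Sum raw and gzip-compressed bytes for the fixed columns
    (CHROM..INFO) and for the FORMAT + sample columns of VCF lines."""
    raw = {"fixed": 0, "samples": 0}
    gz = {"fixed": 0, "samples": 0}
    # wbits=31 asks zlib for a gzip container (assumption: close enough to BGZF)
    comp = {key: zlib.compressobj(6, zlib.DEFLATED, 31) for key in raw}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header lines
        fields = line.split("\t")
        parts = {
            "fixed": "\t".join(fields[:8]),    # CHROM through INFO
            "samples": "\t".join(fields[8:]),  # FORMAT and genotype columns
        }
        for key, text in parts.items():
            data = (text + "\n").encode()
            raw[key] += len(data)
            gz[key] += len(comp[key].compress(data))
    for key in gz:
        gz[key] += len(comp[key].flush())  # emit whatever is still buffered
    return {"raw": raw, "gz": gz}


# Synthetic three-sample records (hypothetical data, not from FAM596.vcf.gz)
demo = ["#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n"]
demo += [
    f"chr1\t{1000 + i}\t.\tA\tG\t50\tPASS\t"
    f"DP={30 + i % 7};AF=0.5;ANN=hypothetical_annotation_{i}\t"
    "GT\t0/1\t0/0\t1/1\n"
    for i in range(1000)
]
sizes = column_sizes(demo)
print(sizes)
```

In this synthetic case the INFO column dominates and the genotype columns compress to almost nothing, so dropping two of the three samples would barely change the compressed size. To try it on a real file, the lines can be streamed with gzip.open("FAM596.vcf.gz", "rt") from the standard library and passed to the same function.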