samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

gvcfz plugin is undocumented #2064

Closed: PlatonB closed this issue 8 months ago

PlatonB commented 8 months ago

Unfortunately, the documentation for this plugin consists of only one line, and there are no usage examples.

In my case, I need to compress non-variant blocks as much as possible. For example, the first two records below should be merged into the third:

chr21   5030197 .       A       <*>     0       .       .       GT:GQ:MIN_DP:PL 0/0:12:4:0,12,119
chr21   5030198 .       A       <*>     0       .       .       GT:GQ:MIN_DP:PL 0/0:12:4:0,12,119
chr21   5030197 .       A       <*>     0       .       END=5030198       GT:GQ:MIN_DP:PL 0/0:12:4:0,12,119

bcftools norm -d all -m+any does not merge 5030197 with 5030198. I suppose the gvcfz plugin would help, but it returns the error "Missing the -g option".

pd3 commented 8 months ago

As with all other bcftools commands, running the program without parameters prints a short usage page. In the case of the +gvcfz plugin, I hope it gives enough detail to get started:

$ bcftools +gvcfz

About: Compress gVCF file by resizing gVCF blocks according to specified criteria.
Usage: bcftools +gvcfz [Options]
Plugin options:
   -a, --trim-alt-alleles          Trim alternate alleles not seen in the genotypes
   -e, --exclude <expr>            Exclude sites for which the expression is true
   -i, --include <expr>            Include sites for which the expression is true
   -g, --group-by EXPR             Group gVCF blocks according to the expression
   -o, --output FILE               Write gVCF output to the FILE
   -O, --output-type u|b|v|z[0-9]  u/b: un/compressed BCF, v/z: un/compressed VCF, 0-9: compression level [v]
       --write-index               Automatically index the output files [off]
Examples:
   # Compress blocks by GQ and DP. Multiple blocks separated by a semicolon can be defined
   bcftools +gvcfz input.bcf -g'PASS:GQ>60 & DP<20; PASS:GQ>40 & DP<15; Flt1:QG>20; Flt2:-'

   # Compress all non-reference sites into a single block, remove unused alternate alleles
   bcftools +gvcfz input.bcf -a -g'PASS:GT!="alt"'
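To make the merge requested in the question concrete, here is a minimal sketch of that transformation in plain shell and awk. It is not the plugin's algorithm and does not use its -g expressions. The function name merge_ref_blocks is hypothetical. The sketch assumes tab-separated, position-sorted records with at least one sample column, an ALT of <*>, and an INFO column of "." on every record to be merged. Consecutive positions whose other columns (apart from REF) are identical are collapsed into one record carrying END=.

```shell
#!/bin/sh
# Hypothetical helper, not part of bcftools: collapse runs of adjacent,
# otherwise identical non-variant records into single END= blocks.
# Assumptions: tab-separated input, sorted by position, ALT is <*>,
# INFO is "." on mergeable records, FORMAT and sample columns present.
merge_ref_blocks() {
    awk '
    BEGIN { FS = OFS = "\t" }
    # Print the block collected so far, adding END= if it spans several bases
    function emit() {
        if (!have) return
        if (last + 0 > start + 0) info = "END=" last
        print chrom, start, id, ref, alt, qual, filt, info, tail
        have = 0
    }
    # Header lines are passed through unchanged
    /^#/ { emit(); print; next }
    {
        # Join FORMAT and all sample columns into one comparable string
        cur = $9
        for (i = 10; i <= NF; i++) cur = cur OFS $i
        # Everything except POS and REF must match to extend a block
        key = $1 OFS $3 OFS $5 OFS $6 OFS $7 OFS $8 OFS cur
        if (have && $5 == "<*>" && $8 == "." && key == prevkey && $2 + 0 == last + 1) {
            last = $2
            next
        }
        emit()
        # Start a new block; REF of the block is the first base
        chrom = $1; start = last = $2; id = $3; ref = $4; alt = $5
        qual = $6; filt = $7; info = $8; tail = cur
        prevkey = key; have = 1
    }
    END { emit() }'
}
```

It could be used as a filter, for instance bcftools view in.g.vcf.gz | merge_ref_blocks, where in.g.vcf.gz is a placeholder file name. The header is not modified, so an INFO definition for END is only present if the input already declares one.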