opencb / hpg-bigdata

This repository implements converters and tools for working with NGS data on HPC or Hadoop clusters
Apache License 2.0

Add stats parameters to the variant query command line to filter by maf, mgf,... #79

Closed jtarraga closed 7 years ago

jtarraga commented 8 years ago

To enrich the variant query command line, we should provide new parameters that let users filter by the minor allele frequency (maf), the minor genotype frequency (mgf), the number of missing alleles or genotypes, etc., for a given study and cohort.

Some examples of these parameters:
--maf for the minor allele frequency
--mgf for the minor genotype frequency
--missing-allele for the number of missing alleles
--missing-genotype for the number of missing genotypes

Each parameter should take a value of the following form, enclosed in double quotes so that the shell does not interpret < and > as redirections: study_name:population_name[<|>|<=|>=|==|!=]numeric_value

For instance: --maf "1000g:all<0.25" --mgf "1000g:all<=0.5" --missing-allele "1000g:all==5" --missing-allele "1000g:all!=0"
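
To make the proposed value format concrete, here is a minimal Python sketch of how such a filter string could be split into its parts and evaluated. It is only an illustration of the format described above, not code from hpg-bigdata: the names `StatsFilter`, `parse_stats_filter` and `matches` are hypothetical, and the regular expression assumes that study and population names contain no comparison characters.

```python
import operator
import re
from typing import NamedTuple

# Two-character operators are listed first so "<=" is not read as "<".
# Assumption: study names contain no ":" and population names contain
# none of the characters used by the comparison operators.
_FILTER_RE = re.compile(
    r"^(?P<study>[^:]+):(?P<cohort>.+?)"
    r"(?P<op><=|>=|==|!=|<|>)(?P<value>[^<>=!]+)$"
)

_OPS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


class StatsFilter(NamedTuple):
    """Hypothetical container for one parsed filter value."""
    study: str
    cohort: str
    op: str
    value: float

    def matches(self, observed: float) -> bool:
        # Evaluate "observed <op> value", e.g. maf < 0.25.
        return _OPS[self.op](observed, self.value)


def parse_stats_filter(expr: str) -> StatsFilter:
    """Parse 'study:population<op>number' into a StatsFilter."""
    m = _FILTER_RE.match(expr.strip())
    if m is None:
        raise ValueError(f"invalid stats filter: {expr!r}")
    try:
        value = float(m.group("value"))
    except ValueError:
        raise ValueError(f"non-numeric value in filter: {expr!r}") from None
    return StatsFilter(m.group("study"), m.group("cohort"), m.group("op"), value)


if __name__ == "__main__":
    maf = parse_stats_filter("1000g:all<0.25")
    print(maf)
    print(maf.matches(0.1))
```

With this sketch, a variant whose maf in study 1000g, population all, is 0.1 would pass the filter "1000g:all<0.25", while a malformed value such as "1000g:all~5" would be rejected with a ValueError.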