bigdatagenomics / bdg-formats

Open source formats for scalable genomic processing systems using Avro. Apache 2 licensed.

Harmonize Variant/VariantCallingAnnotations filters #112

Closed fnothaft closed 8 years ago

fnothaft commented 8 years ago

Related to https://github.com/bigdatagenomics/adam/issues/194 and #108. Specifically, this is the subset of #108 that I'd like to get into 0.10.0.

Variant has:

  /**
   True if filters were applied for this variant. VCF column 7 "FILTER" any value other
   than the missing value.
   */
  union { null, boolean } filtersApplied = null;

  /**
   True if all filters for this variant passed. VCF column 7 "FILTER" value PASS.
   */
  union { null, boolean } filtersPassed = null;

  /**
   Zero or more filters that failed for this variant. VCF column 7 "FILTER" shared across
   all alleles in the same VCF record.
   */
  array<string> filtersFailed = [];

While VariantCallingAnnotations has:

  // FILTER: True or false implies that filters were applied and this variant PASSed or not,
  // while 'null' implies that no filters were applied.
  union { null, boolean } variantIsPassing = null;
  array <string> variantFilters = [];

I'm going to make VariantCallingAnnotations match Variant.
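
To make the intended semantics concrete, here is a minimal Python sketch of how a VCF column 7 "FILTER" value could map onto the three Variant-style fields, and how the old `variantIsPassing` / `variantFilters` pair could be translated to them. This is an illustration only, not code from bdg-formats or ADAM: the `FilterFields` type and both function names are hypothetical. It also assumes that a missing FILTER value (`.`) maps to `filtersApplied = false` rather than null, and that the old `variantFilters` array held the names of failed filters.

```python
from typing import List, NamedTuple, Optional


class FilterFields(NamedTuple):
    """Hypothetical stand-in for the three harmonized filter fields."""
    filters_applied: Optional[bool]
    filters_passed: Optional[bool]
    filters_failed: List[str]


def from_vcf_filter(column: str) -> FilterFields:
    """Map a VCF column 7 "FILTER" value onto the three fields."""
    if column == ".":
        # Missing value: no filters were applied, so pass/fail is unknown.
        # Assumption: filtersApplied is set to false here rather than null.
        return FilterFields(False, None, [])
    if column == "PASS":
        # Filters were applied and all of them passed.
        return FilterFields(True, True, [])
    # Any other value is a semicolon-separated list of failed filter names.
    return FilterFields(True, False, column.split(";"))


def from_legacy(variant_is_passing: Optional[bool],
                variant_filters: List[str]) -> FilterFields:
    """Translate the old variantIsPassing / variantFilters pair."""
    if variant_is_passing is None:
        # Old schema comment: null implies no filters were applied.
        return FilterFields(False, None, [])
    if variant_is_passing:
        return FilterFields(True, True, [])
    # Assumption: the old array listed the filters that failed.
    return FilterFields(True, False, list(variant_filters))
```

For example, `from_vcf_filter("q10;s50")` yields filters applied, not passed, with `["q10", "s50"]` as the failed filters. The old two-field form cannot distinguish "filters applied" from "filters passed" without reading the boolean as a tri-state, which is the ambiguity the three-field form removes.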