sgkit-dev / vcf-zarr-spec

VCF Zarr Specification
Apache License 2.0

Add non-redundant header entries as a list of name-value tuples #29

Open jeromekelleher opened 1 month ago

jeromekelleher commented 1 month ago

Rather than storing the entire header as a single string attribute, we should store all non-redundant header entries as a list of (name, value) tuples.

Non-redundant here means omitting the CONTIG, INFO, FILTER and FORMAT header entries, as these are already fully encoded elsewhere.
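As a rough illustration of the filtering described above, here is a minimal sketch that splits a raw VCF header into (name, value) pairs and drops the structured entries. The function name `non_redundant_entries` and the constant `REDUNDANT_KEYS` are hypothetical, not part of the spec or any library. It assumes the keys are matched as they are spelled in VCF meta-information lines (`contig` is lowercase there), and that lines without an `=` can be skipped.

```python
# Keys assumed to be fully encoded elsewhere in the Zarr store
# (hypothetical constant, named here for illustration only).
REDUNDANT_KEYS = {"contig", "INFO", "FILTER", "FORMAT"}


def non_redundant_entries(header: str) -> list[tuple[str, str]]:
    """Return (name, value) pairs for "##name=value" lines, in header order."""
    entries = []
    for line in header.splitlines():
        if not line.startswith("##"):
            continue  # skips the "#CHROM ..." column line and blank lines
        # Split on the first "=" only; values may contain further "=" signs.
        name, sep, value = line[2:].partition("=")
        if not sep:
            continue  # assumption: ignore anything that is not name=value
        if name in REDUNDANT_KEYS:
            continue
        entries.append((name, value))
    return entries


example_header = "\n".join(
    [
        "##fileformat=VCFv4.3",
        "##source=toolA",
        "##contig=<ID=20,length=62435964>",
        '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
        '##FILTER=<ID=q10,Description="Quality below 10">',
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "##source=toolB",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
)
entries = non_redundant_entries(example_header)
```

With this input, `entries` keeps `fileformat` and both `source` lines in their original order, and nothing else.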

We store the header metadata as a list of (name, value) tuples because there is no requirement that keys are unique. Header information SHOULD be stored in the same order that the items appear in the original header.
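To make the uniqueness point concrete, here is a small sketch comparing a mapping with a list of pairs. It assumes the entries end up JSON-encoded, as Zarr attributes generally are; the example values are made up.

```python
import json

# Hypothetical header entries with a repeated key.
entries = [("fileformat", "VCFv4.3"), ("source", "toolA"), ("source", "toolB")]

# A dict keeps one value per key, so the first "source" entry is lost.
as_dict = dict(entries)

# A list of pairs survives JSON encoding with duplicates and order intact.
# JSON has no tuple type, so each pair comes back as a two-element list.
stored = json.dumps(entries)
restored = [tuple(pair) for pair in json.loads(stored)]
```

Here `as_dict` ends up with two keys, while `restored` compares equal to `entries`.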

I think this is sufficient for lossless round-tripping.