Bioconductor / VariantAnnotation

Annotation of Genetic Variants
https://bioconductor.org/packages/VariantAnnotation

Write solo ##contig/##FILTER/etc lines as structured headers #60

Closed: jmarshall closed 2 years ago

jmarshall commented 2 years ago

Some structured VCF headers have only one field in addition to ID, e.g. ##contig=<ID=1,length=100>, ##FILTER=<ID=X,Description="Y">, or other user-defined structured headers. When a file contains only one header line of such a kind, the existing code incorrectly writes it out as an unstructured header. This is the cause of #36 and #42.

Rsamtools (currently) distinguishes 1-field structured headers from unstructured headers only by the column name of the resulting data frame: either the field name (length, Description, etc.) or a generic Value for unstructured headers.

Use this to avoid misinterpreting solo ##contig/##FILTER/etc headers as simple unstructured lines.
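
To make the rule concrete, here is a minimal sketch of the decision, written in Python purely for illustration; it is not the package's R code. The function name `format_header_lines` and the list-of-pairs input are hypothetical stand-ins for one parsed header data frame, values are assumed to arrive already quoted where the VCF format needs quotes, and the `fileDate` value in the usage lines is made up. The only part taken from the description above is the test itself: a single column named Value means unstructured, anything else means structured.

```python
def format_header_lines(key, rows):
    """Format the rows of one header table as VCF meta-information lines.

    key  -- header key such as "contig" or "FILTER"
    rows -- list of (row_id, fields) pairs, where fields maps column
            name to value (hypothetical stand-in for a data frame)
    """
    lines = []
    for row_id, fields in rows:
        if list(fields) == ["Value"]:
            # A lone generic "Value" column marks an unstructured header.
            lines.append(f"##{key}={fields['Value']}")
        else:
            # Any other column name is a real field, so the line stays
            # structured even when there is only one field besides ID.
            body = ",".join(f"{name}={value}" for name, value in fields.items())
            lines.append(f"##{key}=<ID={row_id},{body}>")
    return lines


# Solo structured headers keep their <ID=...> form.
contig = format_header_lines("contig", [("1", {"length": "100"})])
filt = format_header_lines("FILTER", [("X", {"Description": '"Y"'})])
# A genuinely unstructured header is still written as key=value.
plain = format_header_lines("fileDate", [("fileDate", {"Value": "20220101"})])
```

With this check, a solo `##contig` or `##FILTER` line round-trips as a structured header instead of collapsing to `##key=value`.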

jmarshall commented 2 years ago

Thanks for merging.

Are these fixes likely to appear on the RELEASE_3_15 release branch? I am happy to prepare an additional PR against that branch.

mtmorgan commented 2 years ago

@Kayla-Morrell can you port these to the RELEASE_3_15 branch too?

Kayla-Morrell commented 2 years ago

These changes should now be present in the release branch as well.