pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

Automatically translate VariantRecord on write #1270

Open kkchau opened 6 months ago

kkchau commented 6 months ago

Enable auto-translation of VariantRecord headers on write when a different header is detected.

The current implementation does not check whether the headers match on write, which can result in silent and dangerous reshuffling of fields, e.g. when header records are ordered differently. This PR uncomments the lines that detect a header difference and translate the VariantRecord to the header being written.
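To illustrate the failure mode, here is a minimal pure-Python sketch. It is a toy model, not pysam or HTSlib code: the `Header`, `Record`, and `write` names are hypothetical, and it assumes only that a record stores its INFO keys as integer indexes into its header, so that writing under a differently ordered header reinterprets those indexes. The `translate` method is named after `VariantRecord.translate` but is a simplified stand-in.

```python
# Toy model (hypothetical, not pysam internals): records keep INFO keys as
# integer indexes into their header rather than as strings.

class Header:
    def __init__(self, info_ids):
        self.info_ids = list(info_ids)

    def index_of(self, key):
        return self.info_ids.index(key)


class Record:
    def __init__(self, header, info):
        self.header = header
        # Store (header index, value) pairs, as an index-based encoding would.
        self.fields = [(header.index_of(k), v) for k, v in info.items()]

    def translate(self, dst_header):
        """Remap the stored indexes from self.header to dst_header."""
        if dst_header is self.header:
            return
        self.fields = [
            (dst_header.index_of(self.header.info_ids[i]), v)
            for i, v in self.fields
        ]
        self.header = dst_header


def write(record, out_header, auto_translate):
    """Decode the record against out_header, optionally translating first."""
    if auto_translate and record.header is not out_header:
        record.translate(out_header)
    return {out_header.info_ids[i]: v for i, v in record.fields}


src = Header(["DP", "AF"])
dst = Header(["AF", "DP"])  # same fields, different order

unsafe = write(Record(src, {"DP": 30, "AF": 0.5}), dst, auto_translate=False)
safe = write(Record(src, {"DP": 30, "AF": 0.5}), dst, auto_translate=True)

print(unsafe)  # {'AF': 30, 'DP': 0.5}  (values silently swapped)
print(safe)    # {'DP': 30, 'AF': 0.5}
```

Without the header check, the write succeeds but the values end up under the wrong keys. With the check, the record is remapped before it is written.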

kkchau commented 6 months ago

These lines were added, already commented out, in https://github.com/kkchau/pysam/commit/c30ead4825d588fe9d06d44f0028bf86ee6c6882 and https://github.com/kkchau/pysam/commit/36eb1dae8e21d9b031635f9df5fccf51c80f7642. I'm not sure whether there is still work that needs to be done (if so, I can close this PR), but it seems to work after a couple of tests.