Open hammer opened 1 year ago
https://github.com/pystatgen/sgkit/issues/347 may be related
The point we're illustrating here is the power of open and extensible formats. Previously we had to convert VCFs to our own Zarr formats, which was time-consuming and tedious. Now we can just add a few extra fields and bits of metadata to the sgkit dataset, so the user can do QC directly without keeping several copies of the data (beyond the initial extraction from VCF, but I'd imagine we'll have made the point about columnar binary storage well by then).
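As a rough sketch of what "adding a few extra fields" could look like: sgkit datasets are xarray datasets, so the toy example below builds a small stand-in with plain xarray rather than loading real data. The `call_genotype` variable and its dimension names follow sgkit's conventions as I understand them; `variant_qc_score`, `variant_call_rate_example`, `qc_score_source`, and the 0.5 thresholds are made up purely for illustration.

```python
import numpy as np
import xarray as xr

# Toy stand-in for an sgkit dataset: 4 variants x 3 samples x ploidy 2.
# -1 marks a missing allele call (assumed to match sgkit's convention).
call_genotype = np.array(
    [
        [[0, 0], [0, 1], [1, 1]],
        [[0, 1], [-1, -1], [0, 0]],
        [[-1, -1], [-1, -1], [0, 1]],
        [[1, 1], [1, 1], [1, 1]],
    ],
    dtype=np.int8,
)
ds = xr.Dataset(
    {"call_genotype": (("variants", "samples", "ploidy"), call_genotype)}
)

# Hypothetical extra per-variant field plus a bit of metadata,
# added in place instead of converting to a separate format.
ds["variant_qc_score"] = ("variants", np.array([0.99, 0.80, 0.20, 0.95]))
ds.attrs["qc_score_source"] = "hypothetical upstream pipeline"

# Per-variant call rate: fraction of samples with no missing alleles.
called = (ds["call_genotype"] >= 0).all(dim="ploidy")
ds["variant_call_rate_example"] = called.mean(dim="samples")

# QC filter applied directly on the same dataset (thresholds are arbitrary).
keep = (ds["variant_qc_score"] > 0.5) & (ds["variant_call_rate_example"] > 0.5)
filtered = ds.isel(variants=np.flatnonzero(keep.values))
```

The extra fields sit alongside the genotype calls in one dataset, so the filter needs no second copy of the data.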
To be assigned to @benjeffery once he's a member of our org!