Summary of a discussion with a note for future investigation:
Container Compression Headers have their APDelta flag set when the file is Coordinate Sorted, which affects the behavior of the alignmentStart / alignmentDelta fields in the Container's CramCompressionRecords. Suppose we had a Multi-Reference Slice in a Container with APDelta = true.
How would we calculate alignmentDelta between two consecutive records with differing references?
Does the concept of a delta even make sense here?
Do we have sufficient guards against this happening, if it's a problem?
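To make the first two questions concrete, here is a minimal sketch of naive delta encoding applied across a reference boundary. `Rec` and `AlignmentDeltaDemo` are hypothetical stand-ins, not htsjdk types, and the sketch assumes the first delta is taken relative to a slice-level start position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a CRAM record; NOT an htsjdk class.
final class Rec {
    final int referenceIndex;
    final int alignmentStart;

    Rec(int referenceIndex, int alignmentStart) {
        this.referenceIndex = referenceIndex;
        this.alignmentStart = alignmentStart;
    }
}

final class AlignmentDeltaDemo {
    // Naive delta encoding: each alignment start is stored as the difference
    // from the previous record's start, ignoring which reference it is on.
    static List<Integer> deltas(List<Rec> records, int sliceAlignmentStart) {
        List<Integer> out = new ArrayList<>();
        int previous = sliceAlignmentStart; // assumed baseline for the first record
        for (Rec r : records) {
            out.add(r.alignmentStart - previous);
            previous = r.alignmentStart;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two records on reference 0, then one on reference 1.
        List<Rec> records = Arrays.asList(
                new Rec(0, 1000), new Rec(0, 1050), new Rec(1, 20));
        System.out.println(deltas(records, 1000)); // prints [0, 50, -1030]
    }
}
```

The last delta, -1030, subtracts a position on reference 0 from a position on reference 1. It still round-trips arithmetically, but it has no positional meaning, and the compression benefit that coordinate sorting gives within a single reference does not carry across the boundary.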
This came up when I broke a test on a private branch by assuming that a Multi-Ref context automatically implies APDelta = false. Removing that assumption fixed the test.
CramContainerStreamWriter looks like a good starting point for this investigation, particularly DEFAULT_SLICES_PER_CONTAINER and DEFAULT_RECORDS_PER_SLICE.
Note: the spec requires that Multi-Ref slices NOT be delta-encoded (section 10.2), so we should enforce this. Also note that we no longer store alignmentDelta after #1304.
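One possible shape for that enforcement is sketched below. `ApDeltaGuard` and `validate` are hypothetical names, not existing htsjdk API, and the sketch assumes a Multi-Ref slice is identified by the reference sequence ID -2 that the CRAM spec uses for multi-reference slices.

```java
final class ApDeltaGuard {
    // Assumption: reference sequence ID -2 marks a multi-reference slice.
    static final int MULTI_REFERENCE = -2;

    // Rejects the combination the note above says the spec forbids:
    // a Multi-Ref slice whose alignment positions are delta-encoded.
    static void validate(int sliceReferenceId, boolean apDelta) {
        if (sliceReferenceId == MULTI_REFERENCE && apDelta) {
            throw new IllegalStateException(
                    "Multi-Ref slice must not use delta-encoded alignment positions (APDelta = true)");
        }
    }

    public static void main(String[] args) {
        validate(0, true);                 // single-reference, delta-encoded: accepted
        validate(MULTI_REFERENCE, false);  // Multi-Ref, absolute positions: accepted
        try {
            validate(MULTI_REFERENCE, true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Where such a check belongs (when the slice is built, or when the Compression Header is written) is part of what the investigation would need to settle.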