samtools / htscodecs

Custom compression for CRAM and others.

Fix tok3 encoding bug with duplicated streams. #108

Closed jkbonfield closed 1 year ago

jkbonfield commented 1 year ago

The names "012345a" and "123456" both have a token 0 type of N_DIFF (a standard delta against the previous name) and a token 1 of DIGITS0 with a DZLEN of 6. N_DIFF is type 6, so the token 0 type stream contains 6,6 and the token 1 DZLEN stream also contains 6,6. Because the two streams are identical, the DZLEN stream gets labelled as a dup of stream #0.

Unfortunately our "dup_from" flag is 0 for a non-dupped stream and >0 for a dup of stream X, so a dup of stream #0 looks the same as no dup at all. This meant the data written was incorrect, giving output we couldn't decode.
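To make the ambiguity concrete, here is a minimal sketch in Python. It is a simplified model of the duplicate-stream bookkeeping, not the htscodecs C code: the function names and the list-of-bytes stream layout are hypothetical, and the `None` marker in the second function is just one possible way to remove the ambiguity, not necessarily what this PR does.

```python
from typing import List, Optional

N_DIFF = 6  # token type value quoted in the description above


def find_dups_zero_sentinel(streams: List[bytes]) -> List[int]:
    """Model of the buggy convention: 0 = not a dup, >0 = dup of stream X."""
    dup_from = [0] * len(streams)
    for i in range(1, len(streams)):
        for j in range(i):
            if streams[i] == streams[j]:
                # When j == 0 this stores 0, which also means "not a dup".
                dup_from[i] = j
                break
    return dup_from


def find_dups_explicit(streams: List[bytes]) -> List[Optional[int]]:
    """Hypothetical alternative: None = not a dup, any int = dup of that stream."""
    dup_from: List[Optional[int]] = [None] * len(streams)
    for i in range(1, len(streams)):
        for j in range(i):
            if streams[i] == streams[j]:
                dup_from[i] = j
                break
    return dup_from


# Stream 0: token 0 type bytes (N_DIFF for both names).
# Stream 1: token 1 DZLEN bytes (length 6 for both names).
streams = [bytes([N_DIFF, N_DIFF]), bytes([6, 6])]

print(find_dups_zero_sentinel(streams))  # [0, 0]
print(find_dups_explicit(streams))       # [None, 0]
```

In the first result the entry for stream 1 is 0, so a reader of the flags cannot tell that it duplicates stream #0. In the second, the `None` versus `0` distinction keeps that information.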