bingmann / cobs

COBS - Compact Bit-Sliced Signature Index (for Genomic k-Mer Data or q-Grams)
https://panthema.net/cobs
MIT License

Reset output block after each batch when combining classic indices #18

Open Zhicheng-Liu opened 3 years ago

Zhicheng-Liu commented 3 years ago

When combining classic indices, each batch writes the combined rows from the constituent indices to an output block. The output block is then reused for the next batch.

Because rows from the constituent indices are combined with a bitwise OR, the output block should be reset to all 0s before it is reused. Otherwise, bits set in a previous batch are carried over into the next batch, and false positives accumulate until the end of the batch processing loop.
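
To illustrate the problem, here is a minimal Python sketch of the batch loop. It is not the COBS code: the function `combine_batches`, its `reset` flag, and the representation of each row as a plain integer are all assumptions made for this example, and the real combine step places each constituent index at its own bit offset rather than ORing rows directly on top of each other.

```python
def combine_batches(indices, batch_size, reset=True):
    """OR together matching rows of several indices, one batch at a time.

    Hypothetical model: `indices` is a list of equally long lists of ints,
    where each int stands for one row (a bit vector) of a constituent index.
    """
    num_rows = len(indices[0])
    block = [0] * batch_size  # output block, allocated once and reused
    combined = []
    for start in range(0, num_rows, batch_size):
        rows = min(batch_size, num_rows - start)
        if reset:
            # Clear bits left over from the previous batch.
            for i in range(rows):
                block[i] = 0
        for index in indices:
            for i in range(rows):
                block[i] |= index[start + i]
        combined.extend(block[:rows])
    return combined


index_a = [0b01, 0b00]
index_b = [0b00, 0b10]

# With the reset, each output row contains only its own bits.
print(combine_batches([index_a, index_b], batch_size=1))               # [1, 2]
# Without it, bit 0 from the first batch leaks into the second row.
print(combine_batches([index_a, index_b], batch_size=1, reset=False))  # [1, 3]
```

In this model the stale bit only shows up when the rows span more than one batch; with `batch_size=2` both variants return `[1, 2]`, which may be why the issue is easy to miss on small inputs.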

Zhicheng-Liu commented 3 years ago

@bingmann and @leoisl, could you please review this pull request?