Closed Kerollmops closed 4 years ago
Good point! Let's fix this.
@kerollmops also, if you intend to use bitpacking for meili, make sure to bitpack in blocks of a limited size (typically 128 ints).
@fulmicoton, what do you mean by a block with a limit size?
What I meant is: you do not want to compute the bit size for your entire posting list and stick to it. The reason is that the compression rate will then be determined by your largest delta. If your posting list has millions of elements, you will probably end up having one outlier ruin the entire compression.
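To make the outlier effect concrete, here is a minimal sketch in plain Rust. It does not use the bitpacking crate; the helper names (`bits_needed`, `num_bits_for_deltas`, `total_bits_global`, `total_bits_blocked`) and the one-byte width header per block are assumptions made for illustration only.

```rust
/// Number of bits needed to represent `value` (0 needs 0 bits).
fn bits_needed(value: u32) -> u8 {
    (32 - value.leading_zeros()) as u8
}

/// Bit width needed to store every delta of `sorted`, starting from `initial`.
/// Assumes `sorted` is non-decreasing and its first value is >= `initial`.
fn num_bits_for_deltas(initial: u32, sorted: &[u32]) -> u8 {
    let mut previous = initial;
    let mut max_delta = 0u32;
    for &value in sorted {
        max_delta = max_delta.max(value - previous);
        previous = value;
    }
    bits_needed(max_delta)
}

/// Payload size in bits when a single width is used for the whole list.
fn total_bits_global(sorted: &[u32]) -> usize {
    num_bits_for_deltas(0, sorted) as usize * sorted.len()
}

/// Payload size in bits when every block of `block_len` values gets its own
/// width. Each block is charged 8 extra bits for a (hypothetical) width header.
fn total_bits_blocked(sorted: &[u32], block_len: usize) -> usize {
    let mut previous = 0u32;
    let mut total = 0usize;
    for block in sorted.chunks(block_len) {
        let width = num_bits_for_deltas(previous, block) as usize;
        total += 8 + width * block.len();
        // `chunks` never yields an empty slice, so `last()` is always `Some`.
        previous = *block.last().unwrap();
    }
    total
}
```

With 256 values where a single large jump sits at the start of the second block, the global width is dictated by that one jump, while the blocked variant only pays for it in the block that contains it.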
Ok, so `num_bits_sorted` kind of silently ignores the rest of the slice, right? So what I have to do is store the computed `num_bits` for each block along with the block?
It makes perfect sense to me, thank you!
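For illustration, storing the width next to each block could look roughly like the sketch below. This is a hand-rolled packer in plain Rust, not the bitpacking crate's API or on-disk layout; `BLOCK_LEN`, `encode`, `decode` and the one-byte header are all assumptions.

```rust
/// Assumed block size, following the "typically 128 ints" advice above.
const BLOCK_LEN: usize = 128;

/// Number of bits needed to represent `value` (0 needs 0 bits).
fn bits_needed(value: u32) -> u8 {
    (32 - value.leading_zeros()) as u8
}

/// Delta-encodes a sorted list block by block.
/// Layout per block: one byte holding `num_bits`, then the packed deltas.
fn encode(sorted: &[u32]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut previous = 0u32;
    for block in sorted.chunks(BLOCK_LEN) {
        let deltas: Vec<u32> = block
            .iter()
            .map(|&v| {
                let d = v - previous;
                previous = v;
                d
            })
            .collect();
        // The width is computed per block, so an outlier only affects its own block.
        let num_bits = deltas.iter().map(|&d| bits_needed(d)).max().unwrap_or(0);
        out.push(num_bits);
        let (mut acc, mut filled) = (0u64, 0u32);
        for d in deltas {
            acc |= (d as u64) << filled;
            filled += num_bits as u32;
            while filled >= 8 {
                out.push(acc as u8);
                acc >>= 8;
                filled -= 8;
            }
        }
        if filled > 0 {
            out.push(acc as u8); // pad the last byte of the block with zeros
        }
    }
    out
}

/// Decodes `len` values produced by `encode`, reading each block's own width.
fn decode(bytes: &[u8], len: usize) -> Vec<u32> {
    let mut out = Vec::with_capacity(len);
    let (mut pos, mut previous) = (0usize, 0u32);
    while out.len() < len {
        let block_len = BLOCK_LEN.min(len - out.len());
        let num_bits = bytes[pos] as u32;
        pos += 1;
        let mask = (1u64 << num_bits) - 1;
        let (mut acc, mut filled) = (0u64, 0u32);
        for _ in 0..block_len {
            while filled < num_bits {
                acc |= (bytes[pos] as u64) << filled;
                pos += 1;
                filled += 8;
            }
            previous += (acc & mask) as u32;
            acc >>= num_bits;
            filled -= num_bits;
            out.push(previous);
        }
    }
    out
}
```

The decoder needs the total number of values from somewhere else (here the `len` argument), since the header only records the bit width.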
Yes. I'll update the doc and add an assert.
Your first block required num_bits=10 and the second one num_bits=15. The second block was ignored, so only 10 bits were used. The first delta for which 10 bits were insufficient happened to be exactly 2^10=1024, but that was a coincidence.
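The arithmetic behind this can be checked with a tiny sketch. The helpers `bits_needed` and `truncate_to` are hypothetical, and dropping the high bits is only an assumption about what happens when a value is packed with too few bits; the thread does not confirm the exact behavior.

```rust
/// Number of bits needed to represent `value` (0 needs 0 bits).
fn bits_needed(value: u32) -> u8 {
    (32 - value.leading_zeros()) as u8
}

/// Keeps only the lowest `num_bits` bits of `value`, modelling (as an
/// assumption) what a packer would store if the width is too small.
fn truncate_to(value: u32, num_bits: u8) -> u32 {
    value & ((1u64 << num_bits) - 1) as u32
}
```

With 10 bits the largest representable delta is 2^10 - 1 = 1023, so `bits_needed(1023)` is 10 while `bits_needed(1024)` is 11, and `truncate_to(1024, 10)` yields 0 under the truncation assumption.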
I have computed and added the `num_bits` values in front of all the number blocks.
Thank you! This is fixed now!
(Whoops, the SIGSEGV is not fixed, so I reopened the issue.)
When called with an empty array, `num_bits_sorted` triggers a segfault. It is not specified anywhere that the array must not be empty (or did I miss it?). If you want to easily reproduce the bug, just create an empty `vec![]` and give it to `BitPacker4x::num_bits_sorted`.
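Until the library checks this itself, a caller-side guard is one way to avoid ever handing it an empty or partial slice. The sketch below is plain Rust and does not call the crate; `BLOCK_LEN = 128` is assumed to match the packer's block length, and `split_into_blocks` is a hypothetical helper.

```rust
/// Assumed to match the block length expected by the packer.
const BLOCK_LEN: usize = 128;

/// Splits `values` into complete blocks of `BLOCK_LEN` items plus a remainder.
/// Only the complete blocks should be passed to a block-based packer; the
/// remainder (possibly empty) has to be handled separately by the caller.
fn split_into_blocks(values: &[u32]) -> (Vec<&[u32]>, &[u32]) {
    let chunks = values.chunks_exact(BLOCK_LEN);
    let remainder = chunks.remainder();
    (chunks.collect(), remainder)
}
```

For an empty `vec![]` this returns no blocks and an empty remainder, so the packer is never invoked on it.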