lemire / streamvbyte

Fast integer compression in C using the StreamVByte codec
Apache License 2.0

Adds optional function to compute required memory #33

Closed daniel-j-h closed 3 years ago

daniel-j-h commented 3 years ago

For #32 - please read for context. Sometimes it's better to trade off encoding runtime for reduced peak memory allocation.

Fixes https://github.com/lemire/streamvbyte/issues/32
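
To make the trade-off concrete: without scanning the input, a caller has to allocate for the worst case, which is one control byte per four integers plus up to four data bytes per integer. The sketch below computes that bound. `sketch_max_compressed_bytes` is a hypothetical name used for illustration only, and the library's own bound may differ (for example, by adding padding).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: worst-case StreamVByte size for `length` integers,
 * assuming 2 control bits per integer and at most 4 data bytes each. */
static size_t sketch_max_compressed_bytes(uint32_t length) {
    size_t control_bytes = ((size_t)length + 3) / 4; /* ceil(length / 4) */
    size_t data_bytes = (size_t)length * sizeof(uint32_t);
    return control_bytes + data_bytes;
}
```

For 1000 integers this gives 250 + 4000 = 4250 bytes, whereas input whose values all fit in one byte needs only 250 + 1000 = 1250 bytes once encoded.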

lemire commented 3 years ago

Would you create a benchmark?

What is the runtime overhead of the new function?

daniel-j-h commented 3 years ago

It's a linear scan over the input data; I'm not sure a benchmark makes sense. It targets the use case where you don't care about encoding performance and just want to reduce memory allocations, like the Python use case I outlined in https://github.com/lemire/streamvbyte/issues/32#issuecomment-873651128 :smile:
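
A minimal sketch of such a scan, for illustration only and not the code in this PR: it assumes the default scheme stores each integer in 1 to 4 data bytes, with one control byte per group of four integers. The name `sketch_compressed_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: exact encoded size under the default (1/2/3/4-byte)
 * scheme, found with a single pass over the input. */
static size_t sketch_compressed_bytes(const uint32_t *in, uint32_t length) {
    assert(in != NULL || length == 0);

    size_t control_bytes = ((size_t)length + 3) / 4; /* 2 bits per integer */
    size_t data_bytes = 0;

    for (uint32_t i = 0; i < length; i++) {
        uint32_t v = in[i];
        /* One byte always, plus one more for each byte boundary exceeded. */
        data_bytes += 1 + (v > 0xFF) + (v > 0xFFFF) + (v > 0xFFFFFF);
    }
    return control_bytes + data_bytes;
}
```

A caller that values low peak memory over encoding speed could run this pass first and allocate exactly the returned number of bytes, rather than the worst-case bound.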

daniel-j-h commented 3 years ago

Ah, I'm just realizing this only works for the default mode but not for the 0124 mode.

What about having a second function streamvbyte_compressedbytes_0124 for the 0124 scheme?
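
A rough sketch of what the 0124 counterpart could look like, assuming that scheme stores each integer in 0, 1, 2, or 4 data bytes (zero takes no data bytes, and there is no 3-byte form). The name `sketch_compressed_bytes_0124` is hypothetical and this is not the eventual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: exact encoded size under the 0/1/2/4-byte scheme. */
static size_t sketch_compressed_bytes_0124(const uint32_t *in,
                                           uint32_t length) {
    assert(in != NULL || length == 0);

    size_t control_bytes = ((size_t)length + 3) / 4; /* 2 bits per integer */
    size_t data_bytes = 0;

    for (uint32_t i = 0; i < length; i++) {
        uint32_t v = in[i];
        if (v == 0) {
            /* Zero is encoded in the control bits alone. */
        } else if (v <= 0xFF) {
            data_bytes += 1;
        } else if (v <= 0xFFFF) {
            data_bytes += 2;
        } else {
            data_bytes += 4; /* no 3-byte form in this scheme */
        }
    }
    return control_bytes + data_bytes;
}
```

The only difference from the default-mode scan is the per-value byte count, which is why a separate function is needed rather than a shared one.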

lemire commented 3 years ago

Ok. I am merging this. If you'd like to contribute streamvbyte_compressedbytes_0124, we can do it through another PR.

> It's a linear scan over the input data

All of our functions are effectively linear scans... :-)

This particular function is likely quite expensive in the worst case. See the assembly: https://godbolt.org/z/od937MnYY

daniel-j-h commented 3 years ago

Ok; I'll add another one for 0124 for completeness in #34.