addthis / stream-lib

Stream summarizer and cardinality estimator.
Apache License 2.0

HyperLogLogPlus#getBytes corrupts output when called from multiple threads #66

Closed: yukim closed 10 years ago

yukim commented 10 years ago

HLLP#getBytes uses Varint.writeUnsignedVarInt(int), which is not thread safe, so I see corruption in the returned byte array when serializing different HLLP objects from multiple threads. (stream-lib version is 2.5.1)

I think one fix is to switch to Varint.writeUnsignedVarInt(int, DataOutput).
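To illustrate the idea, here is a minimal sketch of a varint writer that takes a caller-supplied DataOutput, so that each call writes only into the stream owned by the calling thread and no state is shared between callers. This is not stream-lib's code: the class name VarIntSketch and the helper encode are hypothetical, and only the (int, DataOutput) parameter shape is taken from the signature mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration class, not part of stream-lib.
final class VarIntSketch {

    // Encodes value as an unsigned varint: 7 payload bits per byte,
    // high bit set on every byte except the last.
    // All state lives in the arguments, so concurrent calls on
    // different DataOutput instances cannot interfere with each other.
    static void writeUnsignedVarInt(int value, DataOutput out) throws IOException {
        while ((value & 0xFFFFFF80) != 0) {
            out.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.writeByte(value & 0x7F);
    }

    // Hypothetical helper: each call allocates its own buffer,
    // so the returned array is never shared between threads.
    static byte[] encode(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            writeUnsignedVarInt(value, new DataOutputStream(bytes));
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

A serializer written this way can create one output stream per getBytes call and pass it down, rather than relying on any buffer held in a static field.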

tea-dragon commented 10 years ago

I believe that switch has been made in master. The non-thread-safe version shouldn't really exist anymore, but I think its removal was either lost in one of my local branches or it was left in for a test or for some other reason of expedience.

tea-dragon commented 10 years ago

https://github.com/addthis/stream-lib/commit/5a9e9cc7f71e1eb49635df29bba67d905903f1b8 is the commit that should fix it.

yukim commented 10 years ago

You are right. I will try 2.6.0-rc0.