aggregateknowledge / java-hll

Java library for the HyperLogLog algorithm
http://research.neustar.biz/2013/12/24/open-source-release-java-hll/
Apache License 2.0

MAXIMUM_LOG2M_PARAM inconsistent with storage specification and postgresql-hll #14

Open hossman opened 9 years ago

hossman commented 9 years ago

The storage specification and the postgresql-hll docs each say...

https://github.com/aggregateknowledge/hll-storage-spec/blob/v1.0.0/STORAGE.md

registerWidth may take values from 1 to 8, inclusive, and log2(numberOfRegisters) may take on 4 to 31, inclusive.

https://github.com/aggregateknowledge/postgresql-hll/blob/master/README.markdown#explanation-of-parameters-and-tuning

The log-base-2 of the number of registers used in the HyperLogLog algorithm. Must be at least 4 and at most 31.

However, in the Java code itself...

public static final int MAXIMUM_LOG2M_PARAM = 30;

From what I can see, this means postgresql-hll can produce a serialized HLL (one with log2m = 31) that cannot be parsed by HLL.fromBytes.
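To make the gap concrete, here is a minimal standalone sketch that compares the two ranges quoted above. It does not call java-hll or postgresql-hll. The class and method names (`Log2mRangeCheck`, `allowedBySpec`, `allowedByJavaHll`) are hypothetical, and the lower bound of 4 on the Java side is an assumption, since only the upper bound is quoted here.

```java
public final class Log2mRangeCheck {
    // Lower bound quoted by the storage spec and the postgresql-hll docs.
    // Assumed (not quoted in this issue) to be the same in java-hll.
    static final int MINIMUM_LOG2M = 4;
    // Upper bound quoted by the storage spec and the postgresql-hll docs.
    static final int SPEC_MAXIMUM_LOG2M = 31;
    // Value of MAXIMUM_LOG2M_PARAM quoted from the Java code.
    static final int JAVA_MAXIMUM_LOG2M = 30;

    // True if log2m is within the range the storage spec allows.
    static boolean allowedBySpec(int log2m) {
        return log2m >= MINIMUM_LOG2M && log2m <= SPEC_MAXIMUM_LOG2M;
    }

    // True if log2m is within the range implied by the Java constant.
    static boolean allowedByJavaHll(int log2m) {
        return log2m >= MINIMUM_LOG2M && log2m <= JAVA_MAXIMUM_LOG2M;
    }

    public static void main(String[] args) {
        // Report every log2m that the spec permits but the Java limit excludes.
        for (int log2m = MINIMUM_LOG2M; log2m <= SPEC_MAXIMUM_LOG2M; log2m++) {
            if (allowedBySpec(log2m) && !allowedByJavaHll(log2m)) {
                System.out.println("log2m=" + log2m
                        + " is valid per the spec but above the java-hll limit");
            }
        }
    }
}
```

Under these assumptions the only value flagged is log2m = 31, which is the case where an HLL serialized by postgresql-hll would fall outside what the Java constant permits.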