Squalene closed this issue 3 months ago
Thanks for reporting this. I suspect this is related to serialization/deserialization of the registers in the sparse representation. I will investigate. As a temporary fix, you can pass `sparse=False` to the HyperLogLog constructor, e.g. `hll = HyperLogLog(p=8, seed=0, sparse=False)`.
This indeed solves the issue, thank you.
This should be fixed in #47.
Hi,
First of all, thank you very much for this implementation. While playing with the library, I found that serializing and deserializing a HyperLogLog object and then merging it with another one leads to a large drop in accuracy. Here is the code to reproduce:
Python: 3.9.16 HLL: 2.0.3
gives
I have seen the previously resolved issues, and my registers are indeed identical before and after serialization/deserialization. So I suspect the error is somewhere else, but I am not familiar enough with the codebase to find it.
Thank you in advance for your help.
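For reference, the property I am checking can be sketched independently of the library. This is only an illustration under stated assumptions: `ToyHLL`, its `to_bytes`/`from_bytes` methods, and the `build` helper are hypothetical names for a tiny dense-only HyperLogLog written in pure Python. It is not this library's code or API, and it has no sparse mode. The idea is that merging a round-tripped sketch should give exactly the same estimate as merging the original one.

```python
import hashlib
import math


class ToyHLL:
    """Hypothetical dense-only HyperLogLog, for illustration only."""

    def __init__(self, p=8):
        self.p = p
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the first 8 bytes of a SHA-1 digest.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        width = 64 - self.p
        idx = h >> width                    # top p bits select the register
        rest = h & ((1 << width) - 1)       # remaining bits
        rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Merging two sketches is a register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def cardinality(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128
        est = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:       # small-range correction
            est = self.m * math.log(self.m / zeros)
        return est

    def to_bytes(self):
        # Assumed toy format: one byte for p, then one byte per register.
        return bytes([self.p]) + bytes(self.registers)

    @classmethod
    def from_bytes(cls, data):
        hll = cls(p=data[0])
        hll.registers = list(data[1:])
        return hll


def build(start, stop, p=8):
    hll = ToyHLL(p)
    for i in range(start, stop):
        hll.add(str(i))
    return hll


b = build(2500, 7500)
b_roundtrip = ToyHLL.from_bytes(b.to_bytes())

direct = build(0, 5000)
direct.merge(b)                 # merge with the original sketch
via_bytes = build(0, 5000)
via_bytes.merge(b_roundtrip)    # merge with the round-tripped sketch

registers_match = b.registers == b_roundtrip.registers
estimates_match = direct.cardinality() == via_bytes.cardinality()
print(registers_match, estimates_match)
```

In a correct implementation both flags should be true. In my case the registers match, but the estimate after merging the deserialized object does not, which is why I suspect the problem lies outside the register values themselves.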