apache / datasketches-cpp

Core C++ Sketch Library
https://datasketches.apache.org
Apache License 2.0

Get serialized size #429

Closed AlexanderSaydakov closed 4 months ago

coveralls commented 4 months ago

Pull Request Test Coverage Report for Build 9037130030

Totals Coverage Status
Change from base Build 8399205905: 0.0%
Covered Lines: 16404
Relevant Lines: 16576

AlexanderSaydakov commented 4 months ago

Perhaps I could explain this in the test. The sketch reaches capacity for the first time at 2K * 15/16 entries, but at that point it is still in exact mode, so the serialized size is not the maximum (theta is not needed in exact mode). So we need to catch the second time it reaches capacity. Some updates are ignored in estimation mode, so I updated more than enough times while keeping track of the maximum. Perhaps I should have figured out the exact number of updates for this particular sequence, but not assuming that might be even better (say, in case we change the load factor, or just on principle, to avoid relying on implementation details too much).
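To make the reasoning above concrete, here is a minimal sketch of the "update more than enough times and track the maximum" idea. It uses a self-contained toy model, not the datasketches-cpp code or API: `toy_theta_sketch`, `track_sizes`, `toy_max_serialized_size_bytes`, the splitmix64-style hash, and the 16-byte (exact) versus 24-byte (estimation, with theta) preamble are all assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>

// Toy model only: NOT the datasketches-cpp implementation or API.
constexpr uint64_t MAX_THETA = 0x7FFFFFFFFFFFFFFFULL;

struct toy_theta_sketch {
  size_t k;                         // nominal number of entries
  size_t capacity;                  // assumed: table of 2K slots, load factor 15/16
  uint64_t theta = MAX_THETA;       // MAX_THETA means exact mode
  std::set<uint64_t> entries;       // retained hashes, kept sorted

  explicit toy_theta_sketch(unsigned lg_k)
      : k(size_t{1} << lg_k), capacity(2 * k * 15 / 16) {}

  // splitmix64-style mixer, shifted into 63 bits (stand-in for the real hash)
  static uint64_t hash(uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return (x ^ (x >> 31)) >> 1;
  }

  void update(uint64_t item) {
    const uint64_t h = hash(item);
    if (h >= theta) return;         // ignored: this is why extra updates are needed
    entries.insert(h);
    if (entries.size() > capacity) {
      // rebuild: keep the K smallest hashes, theta becomes the (K+1)th smallest
      auto it = std::next(entries.begin(), static_cast<std::ptrdiff_t>(k));
      theta = *it;
      entries.erase(it, entries.end());
    }
  }

  bool is_estimation_mode() const { return theta < MAX_THETA; }

  // Assumed layout: 16-byte preamble in exact mode, 24 bytes once theta is stored.
  size_t serialized_size_bytes() const {
    assert(entries.size() <= capacity);
    return (is_estimation_mode() ? 24 : 16) + 8 * entries.size();
  }
};

// Upper bound in the toy model: full table plus the estimation-mode preamble.
size_t toy_max_serialized_size_bytes(unsigned lg_k) {
  return 24 + 8 * toy_theta_sketch(lg_k).capacity;
}

struct size_stats {
  size_t first_full_exact;  // size when capacity is reached while still exact
  size_t max_seen;          // largest size observed over all updates
};

// Update many times and record sizes, without assuming when the peak occurs.
size_stats track_sizes(unsigned lg_k, uint64_t num_updates) {
  toy_theta_sketch sketch(lg_k);
  size_stats stats{0, 0};
  for (uint64_t i = 0; i < num_updates; ++i) {
    sketch.update(i);
    const size_t size = sketch.serialized_size_bytes();
    if (!sketch.is_estimation_mode() && sketch.entries.size() == sketch.capacity) {
      stats.first_full_exact = size;
    }
    stats.max_seen = std::max(stats.max_seen, size);
  }
  return stats;
}
```

Calling `track_sizes(10, n)` with a generously large `n` and comparing `max_seen` against `toy_max_serialized_size_bytes(10)` mirrors the shape of the test described above. In this toy model the size at the first fill is 8 bytes short of the maximum, because theta is not yet stored.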

jmalkin commented 4 months ago

Yeah, the test is fine. It just feels like overkill to serialize after every update just to check. Not quite an ideal design for quick tests, but with lgK=10 it should be OK in practice.