spotify / ffwd

a flexible metric forwarding agent
https://spotify.github.io/ffwd/
Apache License 2.0

Use different library for HyperLogLog algorithm #222

Closed malish8632 closed 3 years ago

malish8632 commented 3 years ago

This PR introduces a different library that implements the HyperLogLog algorithm. The original library appears to have been archived by its owner.

Some benchmarking was done by an external engineer.

The switch was prompted by occasional issues with our Metrics API, where some instances would not accept metrics. The investigation led to the following log entry, which points to a problem in the library's implementation:

java.lang.ArrayIndexOutOfBoundsException: Index 4 out of bounds for length 4
    at com.clearspring.analytics.stream.cardinality.HyperLogLogPlus.sortEncodedSet(HyperLogLogPlus.java:796)
    at com.clearspring.analytics.stream.cardinality.HyperLogLogPlus.mergeTempList(HyperLogLogPlus.java:764)
    at com.clearspring.analytics.stream.cardinality.HyperLogLogPlus.offerHashed(HyperLogLogPlus.java:321)
    at com.clearspring.analytics.stream.cardinality.HyperLogLogPlus.offer(HyperLogLogPlus.java:336)
    at com.spotify.ffwd.output.CoreOutputManager.lambda$sendBatch$1(CoreOutputManager.java:226)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at com.spotify.ffwd.output.CoreOutputManager.sendBatch(CoreOutputManager.java:226)
    at com.spotify.ffwd.input.CoreInputManager.receiveBatch(CoreInputManager.java:88)
    at com.spotify.ffwd.input.InputChannelInboundHandler.channelRead(InputChannelInboundHandler.java:47)
    ...
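For context, the `offer` call in the trace adds one item to a cardinality estimator. The sketch below illustrates the general HyperLogLog idea in plain Java. It is not the code of the archived library or of its replacement: the class name `MiniHyperLogLog`, the hash function, and the `item-N` keys are all assumptions made for illustration, and it omits the sparse encoding that the failing `sortEncodedSet` frame belongs to.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative HyperLogLog sketch (hypothetical class, dense registers only, not synchronized). */
final class MiniHyperLogLog {
    private final int p;            // number of hash bits used as the register index
    private final int m;            // number of registers, 2^p
    private final byte[] registers; // highest rank observed per register

    MiniHyperLogLog(int p) {
        if (p < 7 || p > 16) {
            throw new IllegalArgumentException("p must be in [7, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    /** Assumed hash for this sketch: 64-bit FNV-1a followed by a bit-mixing finalizer. */
    static long hash(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Records one item: the top p bits pick a register, the rest determine the rank. */
    void offer(String item) {
        long h = hash(item);
        int idx = (int) (h >>> (64 - p));
        long rest = h << p; // remaining bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Estimates the number of distinct items offered so far. */
    long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m); // bias constant for m >= 128
        double estimate = alpha * m * m / sum;
        // Small-range correction: fall back to linear counting while registers are still empty.
        if (estimate <= 2.5 * m && zeros > 0) {
            estimate = m * Math.log((double) m / zeros);
        }
        return Math.round(estimate);
    }

    public static void main(String[] args) {
        MiniHyperLogLog hll = new MiniHyperLogLog(12);
        for (int i = 0; i < 10_000; i++) {
            hll.offer("item-" + i);
        }
        System.out.println("estimated distinct items: " + hll.cardinality());
    }
}
```

With `p = 12` the sketch keeps 4096 one-byte registers, so memory stays fixed regardless of how many metrics pass through, and the estimate is approximate rather than exact. Offering the same key again never changes the result, which is what makes the structure suitable for counting distinct series.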