Open laraabastoss opened 3 months ago
Hi @laraabastoss, thanks for your contribution! Recently some errors were fixed in the automated tests, so I am re-running them for this PR. Let's see how that goes and if you need to change something in your code. Perhaps you will need to pull the latest changes from the main branch.
Aside from that, I wanted to discuss a scope question. River already has a Heavy Hitters algorithm that is meant to provide the same functionality as Space Saving. I noticed that the current implementation in River supports a fading factor. I do not know the pros and cons of Space Saving vs Lossy Count with Forgetting Factor (the core of River's version), but I think we could do some renaming to keep both versions.
The idea is to follow the convention we have followed so far for the stuff in river.sketch: Counter, Set, and so on. The algorithm name and related info go in the documentation. So, in your case:
sketch.HeavyHitters, like FadingHeavyHitters or something else -- suggestions are welcome)
collections module for API usage. This brings familiarity to the users and brings name choices tested by time :D. You can check the existing methods in the sketch module for inspiration.
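To make the naming idea a bit more concrete, here is a minimal sketch of what a Space Saving counter could look like behind a generic name and a collections.Counter-like surface. Everything in it is an assumption for illustration: the class name, the `k` parameter, and the `update` / `most_common` methods are hypothetical and are not taken from River's existing code or from this PR.

```python
from __future__ import annotations

from typing import Hashable


class HeavyHitters:
    """Hypothetical Space Saving sketch; not River's actual class or API."""

    def __init__(self, k: int) -> None:
        if k < 1:
            raise ValueError("k must be a positive integer")
        self.k = k  # maximum number of items tracked at once
        self._counts: dict[Hashable, int] = {}

    def update(self, x: Hashable, w: int = 1) -> None:
        if x in self._counts:
            # Already tracked: just increase its counter.
            self._counts[x] += w
        elif len(self._counts) < self.k:
            # Free slot available: start tracking the item.
            self._counts[x] = w
        else:
            # Space Saving step: evict the item with the smallest counter
            # and let the newcomer inherit that counter plus its weight.
            victim = min(self._counts, key=self._counts.get)
            self._counts[x] = self._counts.pop(victim) + w

    def __getitem__(self, x: Hashable) -> int:
        # Counter-like lookup: untracked items report 0.
        return self._counts.get(x, 0)

    def most_common(self, n: int | None = None) -> list[tuple[Hashable, int]]:
        # Items sorted by estimated count, largest first.
        ranked = sorted(self._counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked if n is None else ranked[:n]


hh = HeavyHitters(k=3)
for item in ["a", "b", "a", "c", "a", "b", "d", "a"]:
    hh.update(item)
print(hh.most_common(1))
```

The stored counts are upper bounds on the true frequencies, since an item that takes over an evicted slot inherits the evicted counter. With unit weights the counters always sum to the number of updates seen so far.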
Added code and the respective documentation for the Space Saving, HyperLogLog, and Hierarchical Heavy Hitters algorithms within the sketch section.