Firionus / FastRunningMedian.jl

Efficient running median for Julia
MIT License
11 stars 1 forks source link

Explore Skip List Algorithm #19

Open Firionus opened 1 year ago

Firionus commented 1 year ago

See https://stackoverflow.com/a/10696252

Is there literature on that? Could it be faster? Does it allow for arbitrary percentiles like SortFilters?

Firionus commented 1 year ago

Preliminary results:

A fast algorithm generally will use a linear/circular buffer for the input values with pointers/indices into the sorted structure and a cache-optimized sorted structure with support for fast O(log n) insertion. Further, the data structure should support quickly finding the next or previous element of the current median, so that the new median can be quickly determined based on insertion position of the new value and removal position of the old value.

Based on this, I suggest to use a cache-optimized B-Tree as data structure. A Julian implementation might look like this:

It is hoped that by moving to Arrays, linear searches inside nodes can be made fast due to high memory locality. Keeping the data structure in such a memory local state requires additional overhead (splitting/merging operations), but it is hoped there is enough flexibility in node length to avoid such rebalances most of the time with typical data distributions. A quick implementation will have to be made, followed by benchmarks and profiles against baseline to determine the validity of these assumptions.

Firionus commented 9 months ago

I created a prototype BTree implementation. After removing type instabilities but not optimizing the algorithm it performs about 4x slower than baseline (FastRunningMedian) while scaling worse. The main slowdown comes from:

Searching in the tree is less of a problem, whereas it was the main problem with SkipLists. The implementation is currently private on my PC due to its prototype quality.

It is known that BTrees are relatively good at reading and worse at writing than some other data structures. For possible improvements with modified data structures, I plan to read this paper: http://supertech.csail.mit.edu/papers/BenderFaJa15.pdf I will pick this optimization up again at a later point in time.

EDIT: I have a new idea for optimizing: Let's make insertion and deletion lazy by letting leaf nodes be unsorted. When we delete, just mark it with a specific linear buffer index. This avoids shifting and updates in the linear buffer, massively reducing deletion cost at the cost of tree balance. We need to figure out a way to keep the tree balanced. When inserting, just insert at first gap or end. When looking for a next/previous/min/max element in a node, do a linear scan.