lemire / FastPriorityQueue.js

a fast heap-based priority queue in JavaScript
Apache License 2.0
357 stars 57 forks source link

'removeMany' with a stop/continue function #23

Open KrishnaPG opened 4 years ago

KrishnaPG commented 4 years ago

Currently removeMany accepts a limit, which allows the iteration to stop based on number of items.

However, when we need to remove items that fall within a range, the no. of items to visit/remove are not known before hand. Request to add a shouldContinue kind of callback to the removeMany that allows the user to control if the removal is done or should proceed to next item.

This is useful to remove a continuous sequence of values in the queue.

For example, when input data contains duplicates, just doing poll once will not be sufficient. One has to remove all elements below the next higher value.

Or, when one wants to remove all values below certain limit x, one needs to keep removing items and stop once a value higher than x is encountered.

One can do it with repeat calls to poll, but since removeMany already exists, it would be great if a continuation control method can be added to it, that gives user more dynamic control on when to stop the iteration.

KrishnaPG commented 4 years ago

On a second thought, removeMany may not be doing sequential removals as it appears to be (since user can conditionally remove or retain any item).

Perhaps. the better alterative is to a separate function, something like removeSequence:

removeSequence(stopFn, start = root) {
  // starting from the start node (by default root), keep removing nodes till stopFn() returns true
}
lemire commented 4 years ago

Sounds good... would you issue a PR?

KrishnaPG commented 4 years ago

Wish I could @lemire - but I have no clue how to implement removing the sequence from any arbitrary starting point. Starting from the root, I believe it would be a repeated conditional call to poll.

while(stopFn() == false) {
    poll()
}

But this is just a wrapper, and I do not like it. It is not the most efficient way (besides it does not start at any arbitrary point in the tree).

A little bit more involved, algorithmic kind of traversal + removal is what is required, and I am not feeling qualified enough. Maybe someone more experienced in implementing this kind of algorithms can lend a hand?

Barbarrosa commented 3 years ago

@KrishnaPG The algorithm would look similar to this.

const rejectedValues = new Array(fpq.size); // Size both arrays according to expected proportions of results
const neededValues = new Array(expectedSize)

let neededCnt = 0;
let rejectedCnt = 0;
while (!fpq.isEmpty() && !stopFn()) {
  const val = fpq.poll();
  if (iNeedIt(val)) {
    neededValues[neededCnt++] = val;
  } else {
    rejectedValues[rejectedCnt++] = val;
  }
}
neededValues.length = neededCnt;

while (rejectedCnt > 0) {
  fpq.add(rejectedValues[--rejectedCnt]);
}
KrishnaPG commented 2 years ago

One quick question: after the add operation or poll operation the underlying array is guaranteed to be in sorted order correct? i.e. a[i] <= a[i+1] for every i

KrishnaPG commented 2 years ago

I am trying to create a time-bound queue where in all items that are expired beyond a time limit need to be removed from the queue automatically.

The below works. But not sure if it is the best we can do.

function clearExpired(tbtQ) {
    let el = tbtQ.peek();
    while (el && el.expiresAt <= performance.now()) {
        tbtQ.poll(); // remove the top element;
        el.tracker.cancel(new Errors.Timeout(`Timeout`));
        el = tbtQ.peek();
    }
}

class TimeBoundTrackers extends PriorityQ {
    constructor(cleanUpIntervalMS = 1000, cleanupFn = clearExpired) {
        super((a, b) => a.expiresAt < b.expiresAt);
        this.cleanUpTimer = setInterval(cleanupFn, cleanUpIntervalMS, this);
    }
    close() {
        if (this.cleanUpTimer) {
            clearInterval(this.cleanUpTimer);
            this.cleanUpTimer = null;
        }
    }
}

Specifically the poll and peek inside a while loop, is not performant, since it "heapifies" the array content after each poll (the percolations).

Is there no way to remove continuous segments of the heap (breadth-wise scan) in one stretch and do the percolations after the removals just once?

lemire commented 2 years ago

We use a heap, not a sorted array. Maybe you need a different data structure.

Barbarrosa commented 2 years ago

@KrishnaPG You could try something like the opposite of how kSmallest works, i.e. scan for the first tree layer containing the value you wouldn't remove then truncate the preceding layers and heapify. Then you'd be left with a smaller number of manual removals, or maybe repeat a few times first if you have a lot of layers.

One consideration for why that wouldn't be a standard way of removing items is that that kind of tree modification (i.e. w/ full heapify) could be a very large (computational) commitment. One-by-one would be better for very large queues.