treeowl closed this pull request 1 year ago
@konsumlamm This is obviously not quite ready to actually merge, but before I make it so, I was hoping you'd give it a glance and see if it's something you'd be okay with if indeed it improves performance.
Note: along with reducing the total queue size by $n$ words, this reduces the number of words allocated on insertion by $2$ (amortized) and the number of words allocated on deletion by $O(\log n)$, thanks to the slightly more compact `Cons` nodes.
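The representation change behind these numbers can be sketched roughly as follows. This is a simplified model for illustration only, not the actual pqueue types; `TreeOld`, `ZeroNew`, and friends are hypothetical names:

```haskell
-- Hypothetical sketch (not the real pqueue internals): in the old
-- shape, every node stores its key and value at the root, and each
-- child chain ends in a nullary Zero.
data TreeOld rk k v = TreeOld k v (rk k v)      -- key, value, children
data SuccOld rk k v = SuccOld (TreeOld rk k v) (rk k v)
data ZeroOld k v = ZeroOld                      -- one dead word per element

-- New shape: the root keeps only the key, and the rank-zero child
-- slot (the "rightmost child") holds the value instead of a dummy,
-- saving one word per element.
data TreeNew rk k v = TreeNew k (rk k v)
data SuccNew rk k v = SuccNew (TreeNew rk k v) (rk k v)
newtype ZeroNew k v = ZeroNew v

-- A rank-0 element in each representation:
oldSingleton :: k -> v -> TreeOld ZeroOld k v
oldSingleton k v = TreeOld k v ZeroOld

newSingleton :: k -> v -> TreeNew ZeroNew k v
newSingleton k v = TreeNew k (ZeroNew v)
```

In the old shape the `ZeroOld` slot carries no information; in the new one that same slot stores the attached value, which is where the $n$-word saving comes from.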
@konsumlamm I believe this is ready to go.
I guess there's still the question of whether to let maps be lazy (better for performance in general) or whether to make them stricter (maintaining theoretical worst-case bounds). I'd lean toward the former in this instance, but I hate having to decide.
@konsumlamm, this still awaits your comments.
@konsumlamm It's been two weeks now.
I don't remember the numbers I got from benchmarks. What did they look like to you? The space savings aren't in the `Zero` value itself; they're in reclaiming the space we previously used to point to it (all over the place) and using that space to store attached values instead.
> I don't remember the numbers I got from benchmarks. What did they look like to you?

The benchmarks take 10-20% less time for me.

> The space savings aren't in the `Zero` value itself; they're in reclaiming the space we previously used to point to it (all over the place) and using that space to store attached values instead.
I see, so we save one word per tree. This is $\log n$ in total though, not $n$, as you said, right?
Ok, the folds seem to improve as well.
> I see, so we save one word per tree. This is $\log n$ in total though, not $n$, as you said, right?
Incorrect. You can count out the `Zero`s in a tree of a given size to get a better understanding. Or you can just notice that we have enough room to fit all $n$ of the values where we used to have nullary `Zero`s.
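To make the counting concrete, here's a tiny throwaway model (a hypothetical `Tree` type, not the library's): in the old representation every node's child chain terminates in one nullary `Zero`, so a tree with $n$ nodes carries exactly $n$ of them.

```haskell
-- Toy model: each node holds a key and a list of children; in the old
-- representation, every node's child chain ends in one nullary Zero.
data Tree = Node Int [Tree]

-- Number of elements in the tree.
size :: Tree -> Int
size (Node _ cs) = 1 + sum (map size cs)

-- Number of nullary Zeros: one per node, at the end of its child chain.
countZeros :: Tree -> Int
countZeros (Node _ cs) = 1 + sum (map countZeros cs)

-- A rank-2 binomial tree: 4 nodes, hence 4 Zeros.
rank2 :: Tree
rank2 = Node 1 [Node 2 [Node 3 []], Node 4 []]
```

So the `Zero` count matches the element count $n$, not the number of trees, which is only $O(\log n)$.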
Ah right, the `Zero`s are the leaves, not the roots; that's where my mistake was.
> I guess there's still the question of whether to let maps be lazy (better for performance in general) or whether to make them stricter (maintaining theoretical worst-case bounds). I'd lean toward the former in this instance, but I hate having to decide.
You mean lazy in the spine? What is the current situation? Does this PR change that?
What were off were my allocation calculations, which I believe I overstated. I'll have to recalculate those if anyone cares.
Lazy all through. If we want to be strict instead, I can do that, but the code will be more complicated and for the most part I expect slower.
So this basically reverts #103? Did you change your mind on #100? I'd be fine with lazy maps, if you say that's more efficient.
Well ... for now it keeps the plain (non-`Prio`) ones the same, as the strict constructors are better there. It more than reverts that for `Prio` queues, since it extends laziness into the trees (not just the spine). It all falls out of the representation change, which leads to different trade-offs.
I don't like having different laziness for `Prio` and non-`Prio` queues.
Can you explain why the new representation necessitates fully lazy maps? Why can't we just keep the old implementation?
I'm a bit worried that fully lazy maps might lead to situations where you build up a lot of computations which slow down a subsequent `deleteMin`, or will that not happen?
It doesn't necessitate fully lazy maps. Keeping them strict in the spine is easy, but not very useful for predictable performance. Keeping them entirely strict in the structure is quite possible too, but it's slightly trickier. That said, I just realized it should be somewhat less tricky with the `Nattish` maps than the old ones. We just need to stop the strict mapping at `Succy Zeroy`, and be lazy at `Zeroy`. It's still more code, but should be pretty readable. I'll give it a whirl and see how that goes.
> Keeping them strict in the spine is easy, but not very useful for predictable performance.
So why was that useful before, but not anymore?
The old representation had entirely strict trees, which held lazy values, so calculating the spine strictly calculated everything (except the values) strictly. The new representation uses the same fields for values as for lists of trees, so those fields must be lazy. To calculate everything but the values strictly, we must calculate strictly down to `Succy Zeroy`, then calculate lazily. I suspect this won't be a big deal. With the old auxiliary class approach, it would've required an extra "layer" of computational structure, but I don't think we'll need that now.
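As a toy illustration of "strict down to `Succy Zeroy`, then lazy" (simplified stand-in types, not the actual `Nattish` machinery): the strict fields of the `Succy` analogue force the structure, while the value slot inside the `Zeroy` analogue is left as a thunk.

```haskell
-- Toy stand-ins (not the real pqueue types): a rank-indexed chain
-- whose last slot holds the attached value.
data Zeroy v = Zeroy v                 -- lazy field: the value slot
data Succy rk v = Succy !Int !(rk v)   -- strict fields: the structure

class MapVal rk where
  mapVal :: (a -> b) -> rk a -> rk b

instance MapVal Zeroy where
  -- Lazy at Zeroy: the mapped value is left as a thunk.
  mapVal f (Zeroy v) = Zeroy (f v)

instance MapVal rk => MapVal (Succy rk) where
  -- Strict above: the bang on the recursive field forces the
  -- structure down to (and including) the last Succy.
  mapVal f (Succy k rest) = Succy k (mapVal f rest)
```

Forcing the result of `mapVal` then evaluates every `Succy` but never the value inside `Zeroy`; for instance, `mapVal (+1) (Succy 7 (Zeroy undefined))` can still be matched against `Succy k (Zeroy _)` without error.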
@konsumlamm I think maps should be back to the way they were now.
Sorry, I double confused myself. The allocation reduction is as big as I thought.
Store the value associated with each key as its rightmost child, which saves one word per element. As a result, the binomial trees must become lazy, which should be good for maps and lazy traversals. The downside is that we will need tag checks to know that we have realized `Succ` constructors. Benchmarks indicate this improves performance.
Closes #115.