Closed treeowl closed 1 year ago
This has extra checking, but seems simple. I have another version in progress that adds a constructor to MExtract
that might be worth comparing; I find it rather more confusing, though, so it would only be worthwhile if it's significantly faster.
Would it be possible to add a test for this?
@konsumlamm I've added a test for plain queues. I'd like you to take a look before I copy that over for Prio queues. The whole testing thing is awkward because we don't export internals and Cabal doesn't allow two .cabal files in the same directory. containers works around that using symlinks, which cause complications testing on Windows. So ... lots of fun all around.
Another approach would be to add a pqueue-internals package, as suggested long ago in another context by @stephen-smith. That's a peculiarly principled way to expose internals to both the test suite and the world; it's possible for the internals package to go up a major version when the main package only goes up a minor version, or even a patch level.
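For concreteness, here is a minimal sketch of how such a split might be laid out. Only the pqueue-internals name comes from the discussion above; the version bounds, module name, and other fields are hypothetical:

```cabal
-- pqueue-internals/pqueue-internals.cabal (hypothetical layout)
cabal-version:    2.4
name:             pqueue-internals
version:          0.1.0.0
synopsis:         Unstable internals of pqueue, exposed for testing

library
  exposed-modules:  Data.PQueue.Internals
  build-depends:    base >=4.9 && <5
  default-language: Haskell2010
```

The main pqueue package and its test suite would then both depend on pqueue-internals, and the internals package could bump its major version independently of the main package.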
If we want, we can actually enforce the shape invariant using a GADT. Here's a definition that merges that change with getting rid of Zero:
data BinomTree rk a = BinomTree !a !(rk a)
newtype One a = One a
data Succ rk a = Succ !(BinomTree rk a) !(rk a)
data BinomForest nil rk a where
  Nil  :: BinomForest 'True rk a
  Skip :: BinomForest 'False (Succ rk) a -> BinomForest 'False rk a
  Cons :: !(BinomTree rk a) -> BinomForest nil (Succ rk) a -> BinomForest 'False rk a
data BinomHeap a where
  BinomEmpty :: BinomHeap a
  BinomCons  :: !a -> BinomForest nil One a -> BinomHeap a
  BinomSkip  :: BinomForest 'False One a -> BinomHeap a
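As a quick illustration of how these types enforce the invariant, here is a hedged sketch (not the library's actual code) of the insertion carry loop over this GADT; link and carry are hypothetical helper names:

```haskell
{-# LANGUAGE GADTs, DataKinds #-}

data BinomTree rk a = BinomTree !a !(rk a)
newtype One a = One a
data Succ rk a = Succ !(BinomTree rk a) !(rk a)

data BinomForest nil rk a where
  Nil  :: BinomForest 'True rk a
  Skip :: BinomForest 'False (Succ rk) a -> BinomForest 'False rk a
  Cons :: !(BinomTree rk a)
       -> BinomForest nil (Succ rk) a
       -> BinomForest 'False rk a

-- Link two trees of equal rank, keeping the smaller root on top.
link :: Ord a => BinomTree rk a -> BinomTree rk a -> BinomTree (Succ rk) a
link t1@(BinomTree x1 ts1) t2@(BinomTree x2 ts2)
  | x1 <= x2  = BinomTree x1 (Succ t2 ts1)
  | otherwise = BinomTree x2 (Succ t1 ts2)

-- Carry a tree into a forest. The result is necessarily nonempty,
-- which is exactly what the 'False index records; in the Skip case
-- the old tail is reused directly, with no copying to change its type.
carry :: Ord a => BinomTree rk a -> BinomForest nil rk a -> BinomForest 'False rk a
carry t Nil         = Cons t Nil
carry t (Skip f)    = Cons t f
carry t (Cons t' f) = Skip (carry (link t t') f)
```

Note how the 'False result of Skip and Cons pays off here: carry's Skip case returns Cons t f where f keeps its existing type, so no structure is rebuilt just to satisfy the index.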
I've made the somewhat unusual choice to force Skip and Cons to produce 'False rather than an arbitrary nullness, to avoid needing to allocate copies of structures to change their types. Surprisingly few changes are necessary to support insert and minView. The "topped" queues with size and minimum stored separately will probably want to "unpack" the BinomHeap sum type for compactness, inlining its functions.
I would recommend deferring the switch to a GADT until we get rid of Zero; it's not really worth the overhaul on its own.
@konsumlamm Feedback on how the test stuff should be structured would be appreciated so I can make progress.
The test structure looks good to me. Perhaps it would be better to move the validity checking functions to the tests, though; then we could also use .&&. to get better error messages. It would also be nice to test that the queues are valid for more than just minView (perhaps one could make a combinator that automatically adds a validity check), but that doesn't have to happen now.
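A hedged sketch of what such a combinator might look like with QuickCheck; checkingValid and the validity predicate are made-up names for illustration, not the test suite's actual API:

```haskell
import Test.QuickCheck

-- Wrap a property so that every run also checks a structural validity
-- predicate, combining the two with .&&. so a failure reports which
-- half went wrong.
checkingValid :: Show q => (q -> Bool) -> (q -> Property) -> q -> Property
checkingValid valid prop q =
  counterexample "invalid queue structure" (property (valid q))
    .&&. counterexample "property failed" (prop q)
```

Each operation-specific property would then assert validity of the queue it was handed as well, with the counterexample labels distinguishing an invariant violation from a behavioral failure.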
If we want, we can actually enforce the shape invariant using a GADT.
That sounds good.
I've made the somewhat unusual choice to force Skip and Cons to produce 'False rather than an arbitrary nullness to avoid needing to allocate copies of structures to change their types.
Isn't nil supposed to indicate whether or not the value is Nil? Why would it make sense to have Skip or Cons be a BinomForest 'True rk a?
I would recommend deferring the switch to a GADT until we get rid of Zero; it's not really worth the overhaul on its own.
Isn't the change just to add an additional type parameter to BinomForest? Or am I missing something?
We'd need to deal with an existential at the top, which requires the same sort of fuss as getting rid of Zero
but with far less benefit. It would be somewhat less fuss if GHC knew how to unpack existentials, but I've made no headway in my efforts to convince GHC HQ to spend the time on that.
The usual advice from @pigworker is to make a GADT type tell you what you need, rather than what you have, which would lead to nullable rather than nil. But then we'd have to write a lot of things like
case x of
  Skip f -> Skip f
  Cons t f -> Cons t f
and it's not very nice. Yeah, it should go away in Cmm, but terms are big and I don't trust it and it's gross.
Then again ... maybe I'm wrong and we really want nullable for merging. I'll play around a bit more.
Okay, trying to do it with nullable is a big mess, as I predicted. Doing it with nil (and dropping Zero) is actually working out pretty well, despite the slight weirdness. Probably the oddest thing is the need for the type of merge to include a type-level &&, but ... it works. When GHC implements first-class existentials, we won't even need that. There is quite a bit of extra code to deal with the possible rank-zero tree at the front; that's pretty annoying, and I don't yet know how it'll affect performance. Anyway, this all should be tracked in some other ticket.
I just ran some benchmarks. Fortunately, k-way merge hasn't slowed down measurably, which I was slightly concerned about. Heapsort seems to have improved very slightly in general (not statistically significant). I'm not surprised that we don't see a bigger benefit there; the "tail" of Skips would have been mostly in normal form and mostly contiguous in memory, so walking it would've been very cheap, and in any case log is a slow-growing function, and therefore also slow to shrink. I think we'd need to do something a bit weirder to measure a big impact, like building a huge queue, emptying it almost all the way, then turning on the timing and switching to alternating insertion with deletion. Anyway, it's good to get the bounds back to what they're theoretically supposed to be.
Shrink the queue spine in extractBin when its size drops from a power of two.
Fixes #108