andrewthad / linear-containers

Containers supporting in-place modification
BSD 3-Clause "New" or "Revised" License
5 stars 1 forks source link

Implement maps and sets as B+ Trees #4

Open andrewthad opened 5 years ago

andrewthad commented 5 years ago

In containers, the Map and Set data types are implemented as balanced binary trees. For the persistent setting, this is certainly the right option since it maximizes sharing. However, in the non-persistent linear setting, we have more flexibility. Here, we can mutate trees in-place. I suspect that a B+ Tree with a degree between 8 and 64 should perform well for many applications. This will take a while to implement. It requires building a primitive linear array type that works much like the Reference type.

aljce commented 5 years ago

i think the size of the btree should be a type level natural. The reason for this is Haskell is good at loop unrolling type level natural things. 2-3 Trees are just b trees of Knuth order 3. This might allow everything to get unrolled into a 2-3 tree when n = 3.

chessai commented 5 years ago

What would this primitive linear array type look like?

andrewthad commented 5 years ago

Unfortunately, there are two different flavors of a linear array. Variant A is simple. It's a fixed-size array. You can modify elements. Variant B is something that you can insert into, shifting everything to the right over by one element. You can only insert up to a maximum size (specified at initialization). After this, inserts result in undefined behavior. So there are a certain number of currently valid elements, and after a certain index, everything is trash.

*** Variant A

data Array :: (Mode -> Type) -> Mode -> Type
modify :: Object f => Array f m ->. Int -> (f Dynamic ->. f Dynamic) ->. Array f m
destroy :: Object f => Array f Dynamic ->. Token
read :: Object f => Array f Static ->. Int -> (Token, f Static)

There are some gaps to fill in (like having statically and dynamically), and the function to create arrays would be kind of weird since we cannot perform ordinary replication-style initialization, but that's the gist of it.

*** Variant B

data Array :: (Mode -> Type) -> Mode -> Type
modify :: Object f => Array f m ->. Int -> (f Dynamic ->. f Dynamic) ->. Array f m
insert :: Object f => Array f Dynamic ->. Int -> f Dynamic ->. Array f Dynamic
delete :: Object f => Array f Dynamic ->. Int -> (f Dynamic, Array f Dynamic)
size :: Array f m ->. (Int, Array f m)

Again, just a sketch, but insert and delete are crucial. I haven't thought hard about where Static and Dynamic are appropriate on this either.