kowainik / slist

♾️ Sized list
https://kowainik.github.io/projects/slist
Mozilla Public License 2.0
46 stars 6 forks source link

[RFC] Sized Difference List #33

Open chshersh opened 4 years ago

chshersh commented 4 years ago

Because of how ordinary lists are implemented in Haskell, left-associative appending of such lists is very slow. The following takes much more time than it should:

((((l1 ++ l2) ++ l3) ++ l4) ++ ...)

Optimal appending order is the following:

l1 ++ (l2 ++ (l3 ++ (l4 ++ ...))))

There's a trick called Difference List which automatically rearranges list appending to get an optimal order. The trick is representing a list in a different way: instead of storing [a], you store [a] -> [a]. It's implemented in the following packages: https://hackage.haskell.org/package/dlist

This technique is very useful if you don't have control of how lists are appending (for example, when appending inside some recursive call).

I thought, that maybe slist package could provide a data type like SDList (similar to DList) — sized difference list to optimize appending. The problem with DList is since it represents the list as a function, you can't get any info from this function for free. With SDList you should be able to store size and get at list size, so I'm thinking about data type like this:

data SDList a = SDList Size (Slist a -> Slist a)

Do you think it's worth having such a structure in the library? I guess you don't need to implement bazillion functions for this structure at first, it will be enough to have a few basic functions, because mostly you're interested in appending such difference lists and converting from and to Slist.

vrom911 commented 4 years ago

That sounds like a fun ide, @chshersh a! I don't mind to experiment on that and see how it can come handy.

Slist is not that widely used and it is completely fine to add such interesting and very related structures to make it useful for more use-cases.

chshersh commented 3 years ago

I'm now thinking, that probably the following type would be better:

data SDList a = SDList Size ([a] -> [a])

We keep size as a separate field, and append plain lists inside.

And then will have Semigroup instance for SDList and a function of type Slist a -> SDList a