hdgarrood / purescript-sequences

Various efficient-ish sequence types for PureScript.
http://pursuit.purescript.org/packages/purescript-sequences
MIT License
45 stars 22 forks source link

Fast JS RRB-tree #31

Closed paldepind closed 5 years ago

paldepind commented 6 years ago

Hello @hdgarrood

I've implemented a highly optimized immutable JavaScript list/sequence that I think could be useful to the PureScript community. I'd love to get your feedback. In particular, I don't know what name to publish the package under.

The library is an implementation of a data-structure called relaxed radix balanced trees. RRB-trees have essentially the same complexities as finger tree and I've also experimented with creating a highly optimized finger tree implementation. But, I've found that RRB-trees offer significantly lower constant factors. I have a benchmark suite here that compares the performance of List to native arrays and other JS libraries. I also ran List with your benchmark suite and I've included the results below. As you can see the performance is very good :smile:

Since the JS library simply is called List the obvious name for the PureScript package would be purescript-list. But that is already the name of something else. Thus I'm wondering, what would be a good name to pick? As the graphs below show the performance of List is better than Data.Sequence. So one option would be to replace the current purescript-sequences with a wrapper around List. What do you think about that? Alternatively, is there some other name that would "make sense" and follow the module naming conventions?

Btw, @gabejohnson has created the PureScript bindings for List. He had the great idea to make the API identical to the Data.Array API. Therefore it can serve as a faster drop-in replacement for Data.Array.

Benchmark results: append insert-lots map fold filter traverse concat-map apply

hdgarrood commented 6 years ago

Hi @paldepind! Really fantastic work on this implementation, that is wonderful to see!

I think I am going to leave this library as-is for now, because I think it's useful that it doesn't use the FFI. This means that people compiling to languages other than JS can still use it.

However, having a nice set of bindings to this library would be fantastic! I realise it has taken me a really long time to respond so this might not be useful at this point, but perhaps purescript-rrb-list would be a good name for the library? I might put it under Data.RRBList? Or maybe Data.RRBArray, if the data type is called Array? It's up to you, of course.

gabejohnson commented 6 years ago

@hdgarrood link to the PR for bindings https://github.com/funkia/list/pull/27. Note that the existing docs are straight from -arrays, but it's not my intention to plagiarize and would love input on changes to make or what form attributions should take.

hdgarrood commented 6 years ago

Thanks for asking :) purescript-arrays is MIT licensed, right? I’d suggest mentioning which parts came from Data.Array in a readme somewhere, and I’d also suggest reading through the license text and making sure you’re doing everything it says in there if you haven’t already. If I remember correctly, with projects using the MIT license (and probably most other open source ones) you’re supposed to include the original license text in derivative works.

paldepind commented 6 years ago

Thank you for the reply @hdgarrood. Good point about this library not being reliant on JavaScript.

I think the name purescript-rrb-list is a very good suggestion. Another option is purescript-rrb-tree but I think that maybe having "tree" in the name could be confusing when the library actually implements a list/sequence?

hdgarrood commented 6 years ago

Right, yeah. If it was up to me I’d probably prefer rrb-list over rrb-tree since the fact that it’s really a tree behind the scenes is mostly just an implementation detail?