Adding a Seq type to Base?

strega-nil commented 6 years ago

I've just rewritten my compiler to use Base, instead of my own standard library, and the only major missing thing was something equivalent to seq. Would it be at all possible to add something like this to Base? I'd be willing to write up an implementation, I'm just curious if anybody else wants this.

Here's where I would use it.

rgrinberg commented 6 years ago

Base already has Sequence. Which is regretfully, superior to what's going to be added to the stdlib. Have you taken a look at it?

strega-nil commented 6 years ago

@rgrinberg Sequence is more complex than Seq, for no discernable purpose to me. You're basically implementing function closure erasure with the 's type; is there a reason to do this?

However, I also thought that Sequence was more similar to Streams, so thanks for your response, and I'll start using them.

c-cube commented 6 years ago

The previous stdlib proposal, at https://github.com/ocaml/ocaml/pull/635 , tried to use the same type as Core.Sequence.t but it seems people at JST are still not sure what's exactly the best type for sequences.

I'm also wondering why Base.Sequence is intrinsically better than the suspension list? Closures are super nice to compose, and keep most operations really straightforward to write. Is there something fundamental in Flambda that will never make it possible to optimize suspension lists as thoroughly as base.Sequence? (cc @mshinwell)

rgrinberg commented 6 years ago

The previous stdlib proposal, at ocaml/ocaml#635 , tried to use the same type as Core.Sequence.t but it seems people at JST are still not sure what's exactly the best type for sequences.

Why is this an issue if the sequence type stays abstract? The proposed improvements were, while interesting, not really essential.

By the way, the type itself shouldn't be called as a "janestreet sequence", it's been well known of in the haskell community: http://fun.cs.tufts.edu/stream-fusion.pdf (just to dispel bias).

Closures are super nice to compose, and keep most operations really straightforward to write.

From the paper I linked, streams allow you to avoid recursion in implementing many primitive combinators. Also I think that many operations are much simpler to implement using Sequence, as they are inherently stateful. Let's compare implementations of things like append, zip, unfold for example. Finally, I think that it's trivial to write a generator on top of Sequences (you basically need the Step type already). With Seq, I'm sure it can be done but it seems far harder and you still need a Step type.

Another reason why I like the stream fusion sequence is that it performed very well in this comparison: https://github.com/rizo/streams-bench Basically outdoing all other purely functional implementations. I realize that this comparison excludes Lazy Lists however. And I'm sure you done plenty of your own benchmarking.

c-cube commented 6 years ago

I get a 404 on your url :-/ The abstract type thing is important imho, because a standard iterator type ought to be a common exchange format for all the stdlibs we have (a kind of protocol for abstracting sequences of elements). If the type is abstract it means that the library where it's defined (base-seq?) can implement some combinators the other libraries can't.

Lazy lists are usually super slow, but my own benchmarks (which involve some map, flat_map, filter in addition to just fold) put sequence far ahead.

rgrinberg commented 6 years ago

What about this: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.7401

About abstract types:

Everything should be doable using functions alone. The only difference is performance. The map and hash table implementation in stdlib is also abstract, and yet nobody complains.

By the way, if it was up to me then making Step.t and friends concrete would be the first choice. The type itself is as old as dirt. Yes, there are some tweaks on top of it, but those are the kind of things that base/containers should focus on. The stdlib doesn't need to micro optimize this use case. Just standardize on something proven in practice and functional.

rgrinberg commented 6 years ago

Btw, how would you implement Sequence.Generator on top of your lazy list? It's a killer application and it's probably possible using lazy lists but I'd hate to implement it.

c-cube commented 6 years ago

What's Sequence.Generator?

edit: ah, I see. I drafted something like that recently, I don't know, it looks like a continuation monad would do the trick? suspension lists are basically continuations… And code with suspension lists is usually quite simple, see oseq.

rgrinberg commented 6 years ago

Let's you do stuff like this easily:

open Base

let rec nums x y =
  let open Sequence.Generator in
  if x > 100 || y < 0 then (
    return ()
  ) else (
    yield (x + y) >>= fun () ->
    yield (x - y) >>= fun () ->
    nums (x + 1) (y - 1)
  )

let seq : int Sequence.t = Sequence.Generator.run (nums 50 50)

Basically, a python like generator API that relies on Monads to work. I did write about it here before http://rgrinberg.com/posts/monadic-generators-ocaml/ ;)

c-cube commented 6 years ago

The map and hash table implementation in stdlib is also abstract, and yet nobody complains.

I do complain (also about Buffer.t) precisely because I can't extend them easily, which is why I need to modify the stdlib to add iterators that are not Sequence.t. But iterators as a common exchange format need to be more extensible anyway.

c-cube commented 6 years ago

https://github.com/c-cube/oseq/blob/e2bb9676412684bbb944bd223a30a178ce232247/src/OSeq.mli#L370 Generator can be done, indeed!

yminsky commented 6 years ago

For what it's worth, I think avoiding abstractions for this kind of code is a mistake. If libraries are hard to extend in practice, we should do something to improve the development process, either by making things easier to extend in the compiler, or making it easier to rely on things not in the compiler. But abstraction is a fundamental software engineering tool that we should not give up lightly.

yminsky commented 6 years ago

Also, can we get back to the original request? Given the existence of Sequence, is there in fact an open request for something to add here?

strega-nil commented 6 years ago

@yminsky I don't believe so. I guess it's now a request for better usage docs for Base, instead of API docs :P

janestreet / base

Adding a Seq type to Base? #41