Closed istathar closed 6 years ago
Let me try to recall and summarize the events that led to this; it will later help me form a better statement about it:
asyncly to address all these issues and to create a library that could be used widely in any Haskell code and had concurrent composition that can be extended to distributed processing as well. The design was pretty much the opposite, or dual, of transient in a general sense. It looks like a simple progression after the fact, but there were many hard problems on the way, and I spent a lot of time solving those to arrive at this final simplicity. The problems fell into two categories: how can all this be done with high performance, and how can we simplify the abstractions so that they are generalized, principled and widely applicable, preferably fitting into known and widely used abstractions. There are a number of problems still to be solved; it has a long way to go. This has the potential to become a simple yet generalized programming framework, a natural concurrency/distributed processing extension fitting into standard applicative and monadic abstractions for all your programming needs.
Thanks for your comments. Since I filed this, I found the discussion at the end of the tutorial in Streamly.Tutorial, which does indeed go into much depth. My suggestion, then, is just to pull this into its own section or file somewhere (a Motivation heading in README.md, or perhaps a HISTORY.md).
On Reddit @chrisdone asked about [constant] space performance (as opposed to time performance). I would agree with you that those tend to go hand-in-hand in Haskell, but it might be worth speaking to that near the top of your README as well. Showing that in a benchmark is hard(er)?
Good luck!
Ah, I even forgot that I wrote something at the end of the tutorial. Yeah, I can pull that into the README, or maybe add a small section in the README pointing to a separate file.
Showing GHC allocations in a benchmark is not hard. In fact, I was planning on that; it's as easy as showing time. I have all the raw numbers and just need to crank out the graphs. The good thing is that all this is automated, so I only need to adapt the code to space numbers instead of time. I just need some time.
@afcowie I updated the readme with more details regarding how it relates to other frameworks and where it fits into the ecosystem. I also added a pointer to the comparison section in the tutorial and a pointer to a discussion on the types of streaming libraries on the streaming-benchmarks page.
Let me know if this is enough so that I can close this issue.
I would like to encourage you to further explain how this library came to be. Streaming I/O has been a well explored space in Haskell (in response to a very knotty problem), with iteratees and enumerator originally, followed by conduit (which solved one set of problems to do with finalizers and space leaks), pipes (which was built by construction to solve others and also derived in a categorical way), and io-streams (which concentrated on IO and performance for webservers). All three converged on similar internal representations. There have been a few fringe contributions since then, quiver and streaming, both of which look at the Functor side of things rather than the Monad [Transformer] one. Finally there are machines and transient, which are special kinds of crazy.
Given this rich history, it's natural for someone encountering this for the first time to wonder where streamly fits in. It seems you got here in a somewhat ad-hoc fashion but clearly you know what you're about and have put an extraordinary amount of work into this library and benchmarking.
You described some of your motivations in https://github.com/composewell/streamly/issues/2#issuecomment-345538441; I would encourage you to lift that up into a paragraph, perhaps at the conclusion of your top-level README.
I think it's incredibly cool that you have fairly transparently muxed threaded concurrency and streaming I/O. I look forward to trying this!
AfC