Open kvanbere opened 10 years ago
I'm personally a little bit sus about all the closures being used and such.
Just so other people now, right now most of my optimization work is being poured into pipes-parse
, which is the current performance bottle-neck. The issue is that FreeT
makes binds in the base monad obligatory, and I'm working on a replacement type that will make the bind optional (at the slight cost of making the monad transformer laws slightly incorrect). The type I'm looking at to replace FreeT
is:
data List f m r = Impure (m (List f m r)) | Cons (f (List f m r)) | Nil r
I'm actually planning on going one step further and replacing the use of Producer
s within pipes-parse
with the following type:
data Of a r = a `Onto` r
-- example type signature I'm aiming for
lines :: List (Of String) IO r -> List (List (Of String IO) IO r
Here's an example of what code written using List
would look like:
zip :: Monad m => List (Of a) m r -> List (Of b) m r -> List (Of (a, b)) m r
zip l1 l2 = case l1 of
Nil r -> Nil r
Impure io -> Impure (liftM (`zip` l2) io)
Cons (a `Onto` l1') -> zipR l2
where
zipR l2 = case l2 of
Nil r -> Nil r
Impure io -> Impure (liftM zipR io)
Cons (b `Onto` l2') -> Cons ((a, b) `Onto` zip l1' l2')
There would be an isomorphism from List (Of a) m r
to Producer a m r
for compatibility with pipes
, and this would play nicely with the upcoming lens
support.
Note that the reason I'm switching to List
is not only performance, but to also simplify the writing of lenses for the upcoming lens-based parsing approach. Right now my observation is that writing new lenses is the difficult part of the new pipes-parse
API and the List
type greatly simplifies this process.
Also, the use of closures is primarily for two reasons:
stdinLn
from Pipes.Prelude
). This is because back in the old days of pipes
there would be top-level recursive functions and they would sometimes trigger bizarre corner cases that would cause space leaks. Making a trivial call to a recursive function (i.e. go
) would fix this problem. This is why none of the top-level functions are directly recursive.Cheers! When I was talking about closures I was more talking about being sus about the dispatch performance for read/write for pipes-concurrency, since they use PAPs heavily.
Keep up the good work.
If you want to know where might be room for performance improvement, I can think of three main areas:
lift
. This is tricky and the only approach I know of requires very sophisticated rewrite rules using the monad transformer laws, including rules for functions defined in Control.Monad
, which is pretty sketchy.(>->)
) when it cannot be transformed to use shortcut fusion. This might require playing with closures or INLINE
/INLINABLE
annotations.pipes-bytestring
when dealing with small chunks, especially when writing to a handle. Right now, stdout
does not buffer bytestring chunks so if you do a manipulation that splits the bytestring into lots of tiny pieces you get tons of small writes which kills performance. There are other operations which also suffer from shredding the bytestring, but this is the most notable one. However, this optimization should probably wait until after changing pipes-parse
.
There's a lot of new exciting things being added in at the moment, which will make this library the BEST way to setup a server so far!
When development settles down and the API becomes concrete, to complement the excellent performance of pipes, this library should be benchmarked/micro-optimised and profiled to make sure it's server-ready (and catch deadlocks or loopholes that may have been missed in the process).
This is a ticket to hold performance notes and ideas.