Closed UnkindPartition closed 8 years ago
It should be possible to reproduce any sharing conundrum that you could make with Data.List. Thus all of these exhaust memory except the one that uses streaming copy:
import qualified Streaming as S
import qualified Streaming.Prelude as S
import Pipes
import qualified Pipes.Prelude as P
import qualified Conduit as C
import qualified Data.Conduit.List as C
import Control.Monad
import Control.Concurrent (threadDelay)
x = 50000000 :: Int
main = stream'
rep :: Monad f => (Int -> f a) -> f ()
rep act = loop 1 where
loop n = do
when (n < x) $ do
act n
loop (n+1)
ns :: [Int]
ns = [1..x]
ss :: Monad m => S.Stream (S.Of Int) m ()
ss = rep S.yield
ps :: Monad m => Producer Int m ()
ps = rep yield
cs :: Monad m => C.ConduitM i Int m ()
cs = rep C.yield
list = do
print $ sum ns
threadDelay 1000000
print $ product ns
conduit = do
n <- cs C.$$ C.sumC
print n
threadDelay 1000000
n' <- cs C.$$ C.productC
print n'
pipes = do
n <- P.sum ps
print n
threadDelay 1000000
n' <- P.product ps
print n'
stream = do
n <- S.sum ss
print n
threadDelay 1000000
n' <- S.product ss
print n'
stream' = do
n <- S.product $ S.sum $ S.copy ss
threadDelay 1000000
print n
I don't follow.
You're showing a different example -- one where you use a stream twice, for two different computations.
In my example, I use the stream once to produce an IO action. That IO action is later executed twice.
I am not sure how your copy
solution applies to my case.
Sorry, the bit about copy
was distracting, then; the comparison with Data.List
and the streaming libraries was the point. In
main = do
let io_action = mapM_ print $ take n $ [0..]
io_action
io_action
ghc thinks it might need the list [0..]
again. There are various ways of keeping it from getting this impression. My thought was that nothing will protect you from sharing curiosities as long as you are using ghc/haskell which must hunt for sharing opportunities. Do you think that something is wrong with conduit, because it produces the difficulty above? It is using a codensity representation, but ghc still thinks it has to preserve its calculation.
Because the representation of S.enumFrom 0
, even when specialized to IO
, is very close to that of [0..]
(in that it doesn't itself involve any monadic steps) -- it differs only by throwing in two constructors for each number emitted, where the list has only :
-- ghc will be especially prone to thinking it can re-use the results it is calculating. In the case in question you were able to break this by inserting monadic steps, but I don't think it is possible generally, as we see from the conduit/pipes/list cases which as you say "use the stream once to define an io action"
Yes, I understand why this is happening, and I didn't want to imply that the problem is somehow specific to this library.
I was just curious if you knew any better workarounds than the one I came up with.
I also have another way to fix this minimal example:
{-# OPTIONS_GHC -fno-full-laziness #-}
import qualified Streaming as S
import qualified Streaming.Prelude as S
a :: () -> IO ()
a () = S.mapM_ print $ S.take 500000 $ S.enumFrom 0
main :: IO ()
main = a () >> a ()
However, in my real code base, the point where S.mapM_
is called is deep down the call stack from where the action is duplicated, and transforming all actions in between to take a dummy argument is not very practical.
@nomeata I know you've done a ton of work in this area. Do you have any thoughts? Is there a chance to get your dup
function into ghc?
For a while, I implemented the three principal offenders, each
yield
and enumFrom
more or less as you did myenumfrom
, interpolating a monadic layer, but this of course slows things down a bit; then I thought of adding a yieldM
eachM
etc., defined like myenumfrom
, but the whole device only gets rid of a range of cases. So in the end I decided that the most raw representation would make it easier to reason and anticipate them: something like as = S.each [0..]
really is very close to as = [0..]
, just more corpulent. A device like delaying with a ()
argument works in some cases, not in others. Insertion of monadic layers is like this. Pipes is putting all but the first element of the stream behind successive () -> ...
but still generates the pathology above. The conduit case is even more remarkable.
The slowdown (which is just a matter of piling on more constructors) was only visible in trivial benchmarky cases, so maybe it's worth it.
Is there a chance to get your dup function into ghc?
Unlikely; it was very fiddly and low-level, and probably is not reliable either.
Unfortunately, I don’t know of any better solution. Even tricks like an ()
argument can possibly fail to work as expected if GHC is too smart. And all the other tricks (monadic actions) just work because they obscure the code to the compiler. If the compiler gets smarter, it will fail again.
We might need pragmas like inline this
or fuse this really hard
to guide the compiler, but again, it is not clear how that would look like precisely.
Working through some of the recalcitrant cases, I did find that -fno-full-laziness
seems to work wonders except where there is a clear binding to a 'pure' stream/pipe/conduit. For example, it much more reliably falls for the () argument trick, if I am understanding my evidence properly - but also if the sameness of stream is otherwise buried. So one prophylactic might be to recommend use of it when compiling main. Strangely, feuerbach's original program
import qualified Streaming as S
import qualified Streaming.Prelude as S
n = 5000000
main = a >> a
a = S.mapM_ print $ S.take n $ S.enumFrom (0::Int)
is not finding any unintended sharing. But I'm sure I observed it yesterday. This suggests to me that the compiler has a learning ability, which wouldn't too much surprise me.
Even tricks like an
()
argument can possibly fail to work as expected if GHC is too smart.
I would be very interested in any cases you find where no-full-laziness
plus ()
argument fails. Please forward any such cases you find to me!
@tomjaguarpaw see my original snippet in the very first post
@feuerbach The original snippet doesn't have ()
argument ... ?
I think data OfNoSharing a b = OfNoSharing a (() -> b)
could be useful for preventing sharing.
@tomjaguarpaw it doesn't, but simply adding a ()
argument to a
didn't help last I checked.
Your OfNoSharing
could indeed work (I haven't checked), but it's rather different from the standard mechanical transformation of adding a ()
argument to the thing you don't want to share, which I thought you were referring to.
simply adding a () argument to a didn't help last I checked
Right, which is why I am interested in
any cases you find where
no-full-laziness
plus()
argument fails.
i.e. do you know of any space leaks where both
no-full-laziness
is enabled, andIf so, as I said, I would be very interested. It is my current hypothesis that if both conditions are satisfied then the thunk will not be shared.
it's rather different from the standard mechanical transformation of adding a
()
argument to the thing you don't want to share, which I thought you were referring to.
They are indeed different, and I'm suggesting that each would work.
Weird, I can't reproduce it anymore. I'm reasonably sure no-full-laziness and dummy arguments didn't help in my real problem at work. I'll try to make a minimal example when I have time.
That's what they all* say :)
@tomjaguarpaw that's an interesting discussion on haskell-cafe. I wondered about something like data OfNoSharing a b = OfNoSharing a (() -> b)
That is effectively what Pipes is using when specialized to Producer
. I labored quite a bit convincing myself that a left-strict pair was better than a lazy one, but never tested a more-than-just-right-lazy pair. I think the existing scheme explains the cases where this library is faster than pipes, which is a lot of the easily comparable cases, so the results might be unpleasant. Another possibility would be to change the (hidden) constructors to
data Stream f m r = Return r | Effect (m (Stream f m r)) | Step (() -> f (Stream f m r))
That might be somewhat more practical. Then it could just be another hidden feature the user wouldn't bump into much, since one could still have essential functions like wrap :: f (Stream f m r) -> Stream f m r
and so on. I made the existing Step (!(f (Stream f m r)))
constructor strict, because it was seeming to win in lots of cases, but I think there was never any considerable difference.
The no-sharing pair would also be kind of a drag since, at least the way I was thinking, it is important for comprehension that the folds in e.g. Streaming.Prelude
should be of the form
sum :: Stream (Of Int) m r -> m (Of Int r)
and the like, so that they can be used for the characteristic rank-2 maps, but they also need to be practical taken by themselves. Construction and deconstruction would be less pleasant. But I don't know. All of this presupposes that generally counseling fno-full-laziness
makes sense, though I suppose it would eliminate some mishaps anyway.
I'm on a bit of a crusade to eliminate full-laziness
, or at least replace it with something more subtle: https://stackoverflow.com/questions/35115172/why-is-no-full-laziness-a-default-optimization/35115664#35115664
I would be interested to see how performance changes under OfNoSharing a (() -> b)
if and when you get results.
I don't understand why the form of sum
would change. Can you explain?
The type signature would be the same, it's just that that that's a place where the user might actually pattern match, say
loop str = do
(n:> rest) <- lift $ S.sum $ S.splitAt 3 str
when (n < (10::Int)) $ do
S.yield n
loop rest
which then becomes
...
loop (rest ())
which isn't bad, just more to keep track of.
Will the CONLIKE
pragma help here?
You mean marking something like enumFrom
with {-# INLINE CONLIKE enumFrom #-}
or {-# INLINE CONLIKE[1] enumFrom #-}
or whatever, in its source, I take it? It doesn't affect the test we were contemplating, though maybe I don't know enough about how to use it. I wonder how it would affect optimization in the intuitively correct case where I don't unreasonably bind a name to the result; optimization is the main thing I am worried about.
Here's the source I'm using - toggling between stream
, list
, etc. - minus the distraction that caused trouble above.
module Main (main) where
import qualified Streaming.Prelude as S
import Streaming (Of(..))
import Pipes
import qualified Pipes.Prelude as P
import qualified Conduit as C
import qualified Data.Conduit.List as C
import Control.Monad
import Control.Concurrent (threadDelay)
x = 50000000 :: Int
main = stream
rep :: Monad f => (Int -> f a) -> f ()
rep act = loop 1 where
loop n = do
when (n < x) $ do
act n
loop (n+1)
ns :: [Int]
ns = [1..x]
ss :: S.Stream (Of Int) ()
ss = rep S.yield
ps :: Monad m => Producer Int m ()
ps = rep yield
cs :: Monad m => C.ConduitM i Int m ()
cs = rep C.yield
list = do
print $ sum ns
threadDelay 1000000
print $ product ns
conduit = do
n <- cs C.$$ C.sumC
print n
threadDelay 1000000
n' <- cs C.$$ C.productC
print n'
pipes = do
n <- P.sum ps
print n
threadDelay 1000000
n' <- P.product ps
print n'
stream = do
n <- S.sum ss
print n
threadDelay 1000000
n' <- S.fold (+) 0 id ss
print n'
@tomjaguarpaw, michaelt is no longer active on GitHub or the Haskell community, unfortunately. Development of this package has moved to https://github.com/haskell-streaming/streaming. If you'd like to reopen discussion, I suggest you do so in a new issue there, and link to this one.
Thanks @treeowl. Done!
This simple code leaks memory because the stream gets shared:
The only workaround I have found is to introduce dummy monadic layers:
Are there any better approaches?