`dunai`: MSF arrows aren't associative in terms of evaluation

ivanperez-keera / dunai

Classic FRP, Arrowized FRP, Reactive Programming, and Stream Programming, all via Monadic Stream Functions

210 stars 35 forks source link

`dunai`: MSF arrows aren't associative in terms of evaluation #245

Closed walseb closed 1 month ago

walseb commented 3 years ago

Hi! I was re-reading the old issue @turion helped me with (#236) and noticed that he mentioned the issue with MSF compose strictness way before I eventually figured out that was the problem, I think I just didn't make the connection. But more relevant to this issue, I was reminded of the problem I encountered there with MSF arrows not being associative. Back then I was more concerned with just getting my yampa stuff working in bearriver, but today I decided to give it a go.

I have managed to reduce it down to this:

main = reactimate (return ()) (\ _ -> return (0.1, Just ())) (\_ b -> print b >> pure False) sfB -- sfB

-- Runs
sfA :: SF () ()
sfA = constant undefined >>> identity >>> constant ()

-- Doesn't run
sfB :: SF () ()
sfB = (constant undefined >>> identity) >>> constant ()

In sfA the output of constant undefined is never evaluated as it isn't needed to create a result. In sfB the output of constant undefined is reduced to WHNF which results in undefined being thrown. But in both examples the output of constant undefined isn't actually needed to get the result so they shouldn't be evaluated even to WHNF. If we put the undefined into a data structure so it can't be fully evaluated with just WHNF we can see that this is indeed a evaluation problem:

-- Runs despite same parenthesis as sfB
sfC :: SF () ()
sfC = (constant (undefined, undefined) >>> identity) >>> constant ()

Beyond this I haven't made any progress yet. For easy reference here are the relevant functions:

constant = arr . const
identity = arr . id

f >>> g = g . f
sf2 . sf1 = MSF $ \a -> do
  (b, sf1') <- unMSF sf1 a
  (c, sf2') <- unMSF sf2 b
  let sf' = sf2' . sf1'
  c `seq` return (c, sf')

ivanperez-keera commented 3 years ago

Good catch.

This problem seems to clearly stem from the use of seq.

Two observations:

If needed I think we could provide a seqA if we don't have one already.
I wonder if it's possible to always move the seq out of the definition of (.). I think it is.

ivanperez-keera commented 3 years ago

@walseb Can you please check what happens if you remove the seq from the definition of (.) and try again?

walseb commented 3 years ago

Yeah it works then and I haven't observed any problems with removing seq. I'm not sure what the consequences of changing something this fundamental would be in terms of performance tough. But the way it is currently this issue can cause mysterious crashes to occur with rec when you haven't defined the value at time 0. To give a real-world example:

Crashes:

sf = proc _ -> do
  rec
    foo <- accumHoldBy (+) (1 :: Float) -< Event bar
    bar <- iPre 1 -< 1
  returnA -< foo

Doesn't crash

sf = proc _ -> do
  rec
    -- Notice the extra iPre, you could probably also use (-->)
    foo <- iPre 1 <<< accumHoldBy (+) (1 :: Float) -< Event bar
    bar <- iPre 1 -< 1
  returnA -< foo

In yampa the first one doesn't crash which caused me further confusion. It's also strange how it crashes despite me initializing accumHoldBy with 1

blucoat commented 3 years ago

* I wonder if it's possible to always move the `seq` out of the definition of `(.)`. I think it is.

I don't see why it can't be. The result of unMSF (sf2 . sf1) a is still strict in the results of unMSF sf1 a and unMSF sf2 b, as a result of pattern-matching, just not strict in the actual values yielded by each MSF. I don't think it should be strict in the output values either, unless this principal were applied consistently to all combinators, like arr, so that MSF can obey the arrow laws.

The upshot if this is that if an MSF has some accumulator that it wants to be forced on each tick, it needs to make sure the (output, continuation) pair it returns depends on that state; it's not sufficient to make the output alone depend on the state, since the output might not be forced on every tick. The feedback combinator already facilitates this, since it pattern-matches on the (output, state) pair produced by the inner MSF before returning anything. Other combinators, like accumulateWith do not. It could, if it were changed from

accumulateWith f s0 = feedback s0 $ arr g
  where
    g (a, s) = let s' = f a s in (s', s')

accumulateWith' f s0 = feedback s0 $ arr g
  where
    g (a, s) = let s' = f a s in s' `seq` (s', s')

Other users of feedback might need to make similar changes as well. The semantics aren't exactly the same as before, but IMO this is more consistent.

freckletonj commented 2 years ago

I was just bitten by this same thing. Essentially I was building a loop, and an arrow within would fail, depending on associativity:

(f >>> g) >>> iPre 0  -- infinite loop
f >>> (g >>> iPre 0)  -- works

I had a helluva time debugging that, but, good to know you all are aware of it :D

ivanperez-keera commented 2 years ago

So:

This definitely is an issue and we need to do something.
Applying seq externally to the output might not always be equivalent. The reason is that seq-ing the output of (f . g) won't seq the intermediate value if f is lazy in its argument (but it will seq the tuple of the result and the continuation in order to run it.

But I think, fundamentally, it's probably better to tell people that they need to make MSFs strict in their arguments if they don't want values to accumulate. That gives them both the control and the responsibility.

So, if we can always get what we had by just adding seqs to MSFs we (as dunai users) compose, then I think the best course of action is to remove that seq from the implementation of (.).

freckletonj commented 2 years ago

out of curiousity, is it possible to execute looping streams without needing delays? I think not, and therefore I struggle to understand why we have loop and not just feedback? Is it because there are times where you need fine-grain control over what's being delayed? I'm just thinking, if we only used feedback, we'd never run into these issues, but, I'm still a bit new to this world of streams and arrows.

turion commented 2 years ago

out of curiousity, is it possible to execute looping streams without needing delays?

Usually not, but there could be special cases where e.g. some type or some monad is sufficiently lazy so that it might work. But definitely not in general.

I think not, and therefore I struggle to understand why we have loop and not just feedback?

I was wondering about that myself as well many times in the past. Basically, feedback is just loop and a delay at a specific position (I believe at the input).

Is it because there are times where you need fine-grain control over what's being delayed?

Probably. It definitely gives you a finer control. But I never was in a position where I strictly needed them. I think there are some situations where syntax and performance are improved over feedback because otherwise e.g. you'd need to wire some variable around in a strange way, but the semantics you can express with feedback are the same as with loop and delays.

As another reason, there are sometimes library functions that already introduce delays. Using them together with feedback will introduce further delays. I had a situation in a library with a similar API where I had delays in a sum function or an integral (https://github.com/turion/essence-of-live-coding-tutorial/blob/f714bcdba85f5db19d64f3480be6a8215b0f8245/Main.hs#L131) and feedback would have introduced extra delays. Instead of reimplementing a non-delaying sum and integral, I just left the library simple and went for rec. Maybe not the best solution one should teach, but the simplest solution for a presentation.

Plus, the MonadFix and ArrowLoop classes were known before MSFs, so it made sense to reference them.

I'm just thinking, if we only used feedback, we'd never run into these issues, but, I'm still a bit new to this world of streams and arrows.

Yes, that's true, and that's probably also what we should teach.

ivanperez-keera commented 2 years ago

out of curiousity, is it possible to execute looping streams without needing delays? I think not, and therefore I struggle to understand why we have loop and not just feedback? Is it because there are times where you need fine-grain control over what's being delayed? I'm just thinking, if we only used feedback, we'd never run into these issues, but, I'm still a bit new to this world of streams and arrows.

I more or less agree with this point of view. Loop was present in signal processing languages like Lustre. Implementations like Yampa also inherited this construct. The fact that a standard class like ArrowLoop exists and is supported by the compiler explicitly has added to the expectation of having an implementation of loop as well.

For me, this has been a point of disagreement with Henrik Nilsson (one of the creators of Yampa). I have expressed, again and again, that for me it is almost never the case that introducing a delay in loop, when needed, is easier than just using loopPre in Yampa or feedback in dunai.

That being said, I have found the use of loop very useful to write physics equations in Haskell, and that is a strong point in favor of not dropping support altogether. But like in prior occasions, there's nothing wrong with defining this in a module (even within one's own code that uses dunai). Dunai is designed to be extensible and extended, after all.

There are many possible ways we could address this:

Add a warning to the implementation of loop. (Question: would it be shown to users when their arrow notation constructs desugar to use loop?)
Remove the import of the instance from Data.MonadicStreamFunction.Core and any modules imported by default.
Add to the documentation of the ArrowLoop instance. (This would not help if users write expressions in arrow notation that desugar to use loop but they do not know it.)
Write further documentation that is likely to be found easily when people experience issues (e.g., infinite loops).

geroldmeisinger commented 2 years ago

Another example which might be related:

git clone git@gitlab.com:gerold.meisinger/yampa-book.git
cd yampa-book
cabal repl textfield1
$ main

https://gitlab.com/gerold.meisinger/yampa-book/-/raw/main/src/textfield1.hs

textfield :: String -> SF Identity (KeyPressed, Event String) Textfield
textfield textInit = proc (keyPress, setText) -> do
  let
    backE  = filterE (== keyBack ) keyPress
    leftE  = filterE (== keyLeft ) keyPress
    rightE = filterE (== keyRight) keyPress
    charE  = if isNoEvent $ mergeEvents [backE, leftE, rightE] then keyPress else NoEvent

  rec
    let handleBack = backE `tag` (if cursorPos > 0 then removeAt text cursorPos else text)
        handleChar = charE <&> insertAt text cursorPos
        limitPos p = min (length textNew) . max 0 $ p
    textNew   <- hold textInit -< mergeEvents [setText, handleBack, handleChar]
    text      <- iPre textInit -< textNew
    cursorPos <- iPre posInit  <<< hold posInit  -< mergeEvents
      [ setText `tag`  cursorPos
      , backE   `tag` (cursorPos - 1)
      , leftE   `tag` (cursorPos - 1)
      , rightE  `tag` (cursorPos + 1)
      , charE   `tag` (cursorPos + 1)
      ] <&> limitPos

  cursorFrame <- animate cursorFrames 5.0 -< ()
  returnA -< (text, cursorPos, cursorFrame)
  where
    posInit = length textInit

Entering characters in the textfield only works when seqwas removed, otherwise I get a Exception: <<loop>>after a while.

ivanperez-keera commented 2 years ago

That's very useful! I'm really happy that you found this example.