Mechanism to make recursive monadic functions not loop

developedby commented 6 months ago

Currently, monadic functions have a problem: if they're recursive and the recursive call depends on a variable bound by an earlier bind operation, they loop and run out of memory.

def Parser/foo:
  with Parser:
    a <- Parser/fn_a
    b <- Parser/fn_b
    c <- Parser/foo(a, b)
    return wrap(c)

This example desugars to:

(Parser/bind Parser/fn_a @a (Parser/bind Parser/fn_b @b (Parser/bind (Parser/foo a b) @c (Parser/wrap c))))

Since the recursive call (Parser/foo a b) is in active position, this function will loop indefinitely.
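
The failure can be imitated in any strict language. The sketch below is a Python analogy, not Bend: Python evaluates call arguments before the call, which stands in here for a term being in active position. All names (`bind`, `step`, `count_eager`, `count_lazy`, `loops`) are hypothetical.

```python
# Python analogy (assumption: eager evaluation stands in for "active
# position"). Names are hypothetical and not part of Bend.

def bind(val, nxt):
    """Maybe-style bind: None short-circuits, anything else feeds nxt."""
    return None if val is None else nxt(val)

def step(n):
    """Returns None once n reaches 0, otherwise n - 1."""
    return None if n == 0 else n - 1

def count_eager(n):
    # The recursive call is evaluated before bind runs, so bind never
    # gets the chance to short-circuit and the recursion never stops.
    rec = count_eager(n - 1)
    return bind(step(n), lambda _m: rec)

def count_lazy(n):
    # The recursive call sits inside the continuation, so it only runs
    # when bind actually calls nxt.
    return bind(step(n), lambda m: count_lazy(m))

def loops(f, n):
    """True if f(n) exhausts the Python call stack."""
    try:
        f(n)
        return False
    except RecursionError:
        return True
```

In this model `count_eager` exhausts the stack for any input, while `count_lazy` stops as soon as `step` yields `None`. The goal of this issue is to get the second behavior automatically from the desugaring.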

We'd like a mechanism that automatically converts ask (<-) terms so that the arguments to the bind function are combinators and the free vars are passed lazily by the bind.

Previously we tried doing the same thing we do for pattern matching, but it failed for any bind that didn't directly return some variation of (nxt val) (for bind with signature (val: Monad a) -> (nxt: a -> Monad b) -> Monad b). The previous transformation was:

(bind Val @x (nxt x free1 ... freen))
# free vars are lifted and passed as extra arguments to `bind`
(bind Val @x @free1 ... @freen (nxt x free1 ... freen) free1 ... freen)

This works for Maybe and Either, but fails for IO, which is the main motivation behind this syntax, so this transformation was removed.

I think I have a solution that works for all monads, but I have no proof. It at least works for everything I've tested so far (Identity, Maybe, Either, State, IO, Cont, List). We make the bind function take an extra argument args, which is a function that passes the free vars to the bind continuation. The new signature of the bind function is

(args: (arg1_t -> ... -> argn_t -> a -> Monad b) -> (a -> Monad b)) -> (val: Monad a) -> (nxt: (arg1_t -> ... -> argn_t -> a -> Monad b)) -> Monad b

Then, to convert nxt to have this new signature, we automatically lift free vars in the bind continuation in this manner:

a <- Val
nxt(a, free1,..., freen)

# Initially desugared to
(bind Val @a (nxt a free1 ... freen))
# Then converted to
(bind @nxt (nxt free1 ... freen) Val @free1 ... @freen @a (nxt a free1 ... freen))

This will make sure that the continuation always forms a combinator which can be lifted into a lazy reference, but requires that users correctly implement their binds with the extra argument. Here, args should not be duplicated in the bind function and it should be applied lazily to nxt, otherwise it will still loop. I can foresee that this will be hard for users to understand.
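
To make the shape of the new bind concrete, here is a minimal Python sketch of a Maybe bind with the extra `args` parameter. It assumes curried Python functions as a stand-in for HVM lambdas, and the names `maybe_bind`, `safe_dec`, `sum_down`, and `sum_down_nxt` are hypothetical. The point is that the continuation is a closed top-level function that receives its free variable first and the bound value second.

```python
# Python sketch of the proposed bind with an extra `args` parameter.
# Assumptions: curried functions model HVM lambdas; names are hypothetical.

def maybe_bind(args, val, nxt):
    """val is ("Some", x) or None. args passes the free vars to nxt."""
    if val is None:
        return None             # args and nxt are never touched here
    return args(nxt)(val[1])    # (args nxt val.val)

def safe_dec(n):
    """Some(n - 1) for positive n, otherwise None."""
    return ("Some", n - 1) if n > 0 else None

def sum_down_nxt(acc):
    # Closed continuation (a combinator): the free variable `acc` comes
    # first, then the value `a` bound by the ask.
    return lambda a: sum_down(a, acc + a)

def sum_down(n, acc):
    if n == 0:
        return ("Some", acc)
    # a <- safe_dec(n); sum_down(a, acc + a)
    return maybe_bind(lambda nxt: nxt(acc), safe_dec(n), sum_down_nxt)
```

Here `sum_down(3, 0)` walks 2, 1, 0 and returns `("Some", 3)`, while a negative input short-circuits to `None` without ever applying `args`.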

Two considerations that I also raise here are:

Which is the best order for the bind arguments? (args->val->nxt, val->nxt->args, others)
Can this possibly cause loops if val is recursive? I think not unless the first val is an active recursive call, but I haven't done enough tests. We could solve this by also applying another args-like value to val if it has free variables, but that would increase the complexity of bind one more level.

developedby commented 6 months ago

Here's a list of bind functions before and after this proposed change. Basically, whenever we apply something to nxt we swap that for applying to (args nxt), although that might not be exactly the case for all bind functions.

type Maybe = (Some val) | None
type Result = (Ok val) | (Err val)

Maybe/bind = @val @nxt match val {
  Maybe/Some: (nxt val.val)
  Maybe/None: None
}
Maybe/bind_ = @args @val @nxt match val {
  Maybe/Some: (args nxt val.val)
  Maybe/None: None
}

Result/bind = @val @nxt match val {
  Result/Ok: (nxt val.val)
  Result/Err: (Result/Err val.val)
}
Result/bind_ = @args @val @nxt match val {
  Result/Ok: (args nxt val.val)
  Result/Err: (Result/Err val.val)
}

List/concat (List/Cons x xs) ys = (List/Cons x (List/concat xs ys))
List/concat List/Nil ys = ys

List/flatten List/Nil = List/Nil
List/flatten (List/Cons x xs) = (List/concat x (List/flatten xs))

List/map (List/Cons x xs) f = (List/Cons (f x) (List/map xs f))
List/map List/Nil f = List/Nil

List/bind = @val @nxt match val {
  List/Cons: (List/flatten (List/map val nxt))
  List/Nil: List/Nil
}
List/bind_ = @args @val @nxt match val {
  List/Cons: (List/flatten (List/map val (args nxt)))
  List/Nil: List/Nil
}

State/bind = @val @nxt @state let (state, res) = (val state); (nxt res state)
State/bind_ = @args @val @nxt @state let (state, res) = (val state); (args nxt res state)

Identity/bind = @val @nxt (nxt val)
Identity/bind_ = @args @val @nxt (args nxt val)

Continuation/bind = @val @nxt @cont (val (@a (nxt a cont)))
Continuation/bind_ = @args @val @nxt @cont (val (@a (args nxt a cont)))

IO/bind_ = @args @val @nxt match val {
  IO/Done: ((args nxt) val.expr)
  IO/Call: (IO/Call IO/MAGIC val.func val.argm @x (IO/bind_ args (val.cont x) nxt))
}

developedby commented 6 months ago

A suggestion by @tjjfvi on Discord:

another approach that I believe would solve this:

I'm going to use defer value as shorthand for @unit match unit with * { Unit: value }, and undefer value as shorthand for (value unit), where data Unit = Unit // desugars to Unit = @x x; idk what new bend syntax is for this

a <- Val
nxt(a, free1,..., freen)

# desugars to
(bind Val (defer @a (nxt a free1 ... freen)))

# which itself desugars to
(bind Val @id (id @free1 ... @freen @a (nxt a free1 ... freen) free1 ... freen))
#                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#                               combinator!

# bind changes from
Maybe/bind = @val @nxt match val {
  Maybe/Some: (nxt val.val)
  Maybe/None: None
}

# to
Maybe/bind_ = @val @nxt match val {
  Maybe/Some: ((undefer nxt) val.val)
  Maybe/None: None
}

# which desugars to
Maybe/bind_ = @val @nxt match val {
  Maybe/Some: (nxt Unit val.val)
  Maybe/None: None
}

defer is the inet equivalent of a lazy thunk
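
A rough Python sketch of the defer/undefer idea, assuming a one-argument function that takes a unit value as the stand-in for the deferred term. The names `UNIT`, `forced`, `defer`, `undefer`, `maybe_bind`, and `inc` are hypothetical; `forced` exists only to make it visible when the thunk is run.

```python
# Python sketch of defer/undefer. Assumption: a function awaiting a unit
# value models the deferred term. Names are hypothetical.

UNIT = ()

forced = []  # records every time a deferred continuation is forced

def defer(make_value):
    """Wrap a computation so that it only runs when given UNIT."""
    def thunk(unit):
        forced.append(unit)
        return make_value()
    return thunk

def undefer(thunk):
    """Force a deferred value by applying it to UNIT."""
    return thunk(UNIT)

def maybe_bind(val, nxt):
    """val is ("Some", x) or None; nxt is a deferred continuation."""
    if val is None:
        return None                 # nxt is never forced on this branch
    return undefer(nxt)(val[1])     # ((undefer nxt) val.val)

# A deferred continuation that increments the bound value.
inc = defer(lambda: lambda a: ("Some", a + 1))
```

Binding `None` leaves `forced` empty, while binding `("Some", 1)` forces the thunk once and yields `("Some", 2)`. Compared with the `args` approach, bind implementers only need to remember a single `undefer` at each use of `nxt`.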

developedby commented 6 months ago

A short example where this is necessary:

type Result = (Ok val) | (Err val)

Result/bind = @val @nxt match val {
  Result/Ok: ((undefer nxt) val.val)
  Result/Err: (Result/Err val.val)
}
Result/foo x y = 
  with Result {
    ask a = (Result/Ok x)
    ask b = switch y { 0: (Result/Err a); _: (Result/Ok y-1) }
    (Result/foo a b)
  }

main = (Result/foo 1 2)
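
For readers who want to trace the example, here is a Python port under stated assumptions: tuples model the Result constructors and a zero-argument lambda models `defer`. The names `ok`, `err`, `result_bind`, and `result_foo` are hypothetical, and this only mirrors the intended semantics, not how Bend compiles it.

```python
# Python port of the example above. Assumptions: tuples model the Result
# constructors and a zero-argument lambda models `defer`.

def ok(v):
    return ("Ok", v)

def err(v):
    return ("Err", v)

def result_bind(val, nxt):
    tag, payload = val
    if tag == "Ok":
        return nxt()(payload)   # ((undefer nxt) val.val)
    return val                  # (Result/Err val.val)

def result_foo(x, y):
    return result_bind(
        ok(x),
        # ask a = (Result/Ok x)
        lambda: lambda a: result_bind(
            # ask b = switch y { 0: (Result/Err a); _: (Result/Ok y-1) }
            err(a) if y == 0 else ok(y - 1),
            # (Result/foo a b), only reached when the previous ask is Ok
            lambda: lambda b: result_foo(a, b),
        ),
    )
```

In this model `result_foo(1, 2)` recurses with `y` going 2, 1, 0 and then stops with `("Err", 1)`, because the recursive call is only forced on the `Ok` branch.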
