swarm-game / swarm

Resource gathering + programming game
Other
840 stars 52 forks source link

Better support for mutual recursion #605

Open byorgey opened 2 years ago

byorgey commented 2 years ago

Is your feature request related to a problem? Please describe. Today I wrote this code:

def DFS =
  ifM (ishere "goal") {grab; return ()} {};
  ifM (ishere "rock") {} {
    place "rock";
    tL; b <- blocked; if b {} {move; DFS};
    tR; b <- blocked; if b {} {move; DFS};
    tR; b <- blocked; if b {} {move; DFS};
    tL
  };
  tB; move; tB
end;

and I wanted to factor out the b <- blocked; if b {} {move; DFS} part into its own function, but that would require defining two mutually recursive functions. At the moment that is not directly possible, since function definitions happen in sequence.

Describe the solution you'd like It would be nice if there was some way (perhaps something like Agda's mutual blocks?) to define several things at once.

Describe alternatives you've considered We can already encode mutual recursion using pairs. For example:

def evenodd = (\n. if (n == 0) {true} {snd evenodd (n-1)},   \n. not (fst evenodd n)) end;
def even = fst evenodd end;
def odd = snd evenodd end

We can also encode mutual recursion by having the first function take the second as an argument. For example:

def even' = \odd. \n. if (n == 0) {true} {odd (n-1)} end;
def odd = \n. not (even' odd n) end;
def even = even' odd end

I am not sure how this compares to a built-in solution, efficiency-wise. It's also kind of awkward. But maybe it's better to just provide fundamental things and let people encode more complex features.

xsebek commented 2 years ago

Is there something stopping us from collecting the definitions that occur after each other and considering them at the same time?

At least for the purposes of name resolution that is. But maybe that would not work with that recursion to forcing delayed computation rewrite. :thinking:


The worst thing we could do is to automatically rewrite this:

def A = \a. ... end
def B = \b. ... end
...

to this

def A = let defs = (\a. ..., \b. ..., ...) in fst defs end
def B = let defs = (\a. ..., \b. ..., ...) in fst $ snd defs end
...

But that sounds like a textbook nightmare example.

valyagolev commented 2 years ago

is it maybe possible to just pass the function (DFS) as an argument to the other one?

byorgey commented 2 years ago

@valyagolev oh, good idea! Yes, I think that's another way to encode mutual recursion. So something like this:

f' g' = ... g' ...
g = ... (f' g) ...
f = f' g

Instead of calling g directly, f' takes g' as an argument and calls that. g calls f' and passes itself as an argument. Finally we can define f as f' with g passed in.

valyagolev commented 2 years ago

oh, I meant this for the solution of the initial problem from the issue. otherwise (if takes as a way to implement recursion) my suggestion is almost identical to the immediately preceding one :)

tbh, while playing the game there was one particular moment when I thought I'd love if the language would resolve the names late

basically if i have have a "lib.sw" file with a function called "go", and i type run "lib"; go, the first time it won't even run, because go is not defined. but if i run the script once, and then try to use run "lib"; go to quickly iterate (up, enter), it will run... the previous version each time! which is a bit annoying in the repl

resolving names late would fix this problem and give mutual recursion for free

valyagolev commented 2 years ago

late resolution of names is very good for repl, IMO. one objection here would be that type checking becomes impossible. but this could be solved by requiring that a redefined name must have the same type. which is also annoying in the repl

might here be some kind of a trick?..

valyagolev commented 2 years ago

we could have a new kind of value, like FutureBind Name UType (Maybe Term) (might be very wrong about the type, not very familiar with the codebase), which basically says: this might be rebound later with this name; it must adhere to this type; if it's not rebound or rebound with a wrong type, use this (previous) term

xsebek commented 2 years ago

@valyagolev thanks for the mutual recursion tip, it looks a bit easier to write than pairs of functions. :+1: :smile:

Re run, that is a different issue (#495) and we would like to improve that by loading the file before type checking so that your go is resolved properly. :slightly_smiling_face:

Regarding late resolution, does any other language with a REPL do that? The only one I can think about is Bash, but that is untyped and in many ways different from Swarm. I vaguely remember some REPL turning multiline (which we do not have yet #134) until you have a correct expression. That way you can at least go back and edit your previous definition.

valyagolev commented 2 years ago

Re run, that is a different issue (#495) and we would like to improve that by loading the file before type checking so that your go is resolved properly. 🙂

oh, yes, this is better, of course. read-time is good

Regarding late resolution, does any other language with a REPL do that? The only one I can think about is Bash, but that is untyped and in many ways different from Swarm

can't really imagine a typed language here, but in Python any global name used by a function can be reassigned and it will use a new name. it basically does a dict lookup every time

at the end of the day, mutually recursive definitions in an incrementally-executed language (as opposed to one which allows you to read the whole source) seem to me to be impossible to process without a mechanism to treat an unknown name as a promise that something appropriately typed will be under that name at some point

if we let such a promise survive beyond the type-checking phase, it becomes late resolution

valyagolev commented 2 years ago

such a promised name could be also expressed as part of the (much more broadly understood) type. it’s basically a negative part of the environment type. i see two ways here:

  1. as a part of the term’s type itself, basically like “ E { name1 :: Type1, name2 :: Type2 } => SomeType( Type1, Type2 ) “. not very nice because it would mean that even after definition of those names this dependency will dangle, or you have to update the types of previously-defined terms
  2. as a typed restriction of the environment. something like: the moment you define a function that uses a name that doesn’t exist, you “owe” that one. you can’t execute any code, but can only define names, until you fulfill that promise (or give up somehow)
valyagolev commented 2 years ago

sorry for bombarding you with the comments, but here’s another idea:

completely separate definition of functions, and their execution

one-line repl is not a very good place to write and edit functions, IMO

so we could have a list of functions and their types somewhere in the interface, each editable in a nicer editor, and you can easily add new ones

and then its trivial to promise one of them to be mutually-recursively defined later.

this has another benefit of integration with the resource system more easily, as you could immediately see which resources a function needs

byorgey commented 2 years ago

Hi @valyagolev, thanks for the ideas! I had not thought much about early vs late binding before so this was really interesting and enlightening to think about. However, I'm not convinced it's a good idea. The biggest issue for me is the type system. We are committed to having a strong, static type system, and I would not want late binding to result in possible crashes at runtime. But, as you discussed, figuring out how to have a sensible type system that interacts with this feature is complex at best (and perhaps impossible). It would be very interesting to think about, to be sure, but we only have so much complexity budget as far as the type system is concerned, and I am much more interested in spending that complexity budget elsewhere.

at the end of the day, mutually recursive definitions in an incrementally-executed language (as opposed to one which allows you to read the whole source) seem to me to be impossible to process without a mechanism to treat an unknown name as a promise that something appropriately typed will be under that name at some point

That's not really true --- as I mentioned in the original issue, you just need a way to explicitly define multiple mutually recursive functions at once. Agda and Coq both have this (they have to be incrementally processed for other reasons).

one-line repl is not a very good place to write and edit functions, IMO

I completely agree! That's why we've added LSP support, so the idea is that you will spend most of your time writing functions in your favorite IDE, not at the REPL. Maybe we need to emphasize this more, e.g. in the tutorial (it is mentioned in the tutorial but only once towards the very end).

valyagolev commented 2 years ago

what you say makes perfect sense to me, thank you! i do use the editor like that, so i'll just be looking forward for the new import read-level primitive

byorgey commented 10 months ago

One possible path toward a solution would be:

  1. Create syntax for let with multiple bindings, at first just as syntax sugar for nested let
  2. Add a variant letrec which brings all the bound identifiers into scope in the bodies. So e.g. you could write
    letrec even : int -> bool = \n. if (n == 0) {true} {odd (n-1)}, odd : int -> bool = \n. not (even n) in ...
  3. Make def just syntax sugar for let/letrec somehow, as discussed in #1087 .
    • Note, as of now this has been done, def is just syntax sugar for let.