First-class functions and closures support

schungx commented 4 years ago

More specifics will be good here...

GopherJ commented 4 years ago

@schungx hi sorry yesterday I just came up with this and went to sleep so there isn't description~

So actually I'm trying to use closure for lazy function:

Something like:

// a simple function to add something
const add = (n) => (n + 10);
add(9);
// a simple memoized function to add something
const memoizedAdd = () => {
  let cache = {};
  return (n) => {
    if (n in cache) {
      console.log('Fetching from cache');
      return cache[n];
    }
    else {
      console.log('Calculating result');
      let result = n + 10;
      cache[n] = result;
      return result;
    }
  }
}
// returned function from memoizedAdd
const newAdd = memoizedAdd();
console.log(newAdd(9)); // calculated
console.log(newAdd(9)); // cached

Can we already do this in rhai? I believe closure is in the plan right?

schungx commented 4 years ago

Can we already do this in rhai? I believe closure is in the plan right?

Akem... not really... to have closures we really need a GC, otherwise it is a pain to figure out when we can drop the captured environment. Maybe an Rc... but this means first-class functions, which may not be too difficult...

I think the title probably should be "first-class functions and closures"

schungx commented 4 years ago

TL;DR

Thinking about this some more, this is actually not very difficult to add to Rhai. However, that does mean some changes to the design and syntax of the language.

1) We can enhance Dynamic to hold a reference-counted function trait object. This is easily done and all the mechanisms are already there. 2) We can call such a function trait object held by a Dynamic. 3) We can pass such values to functions, thus first-class functions. 4) We can extend the syntax of Rhai to support function literals (e.g. using Rust's |xxx| { ... } syntax).

However, first-class functions are really not very useful without closures, i.e. the ability to capture the running environment. This is where things become complicated. Rhai functions are deliberately pure, so they never mutate their environment. They cannot access the calling environment and must do so entirely via arguments passed by value. Functions are also allowed to be defined only at global level. This helps simplify the parser.

So, there'll need to be some architectural changes to Rhai in order to support closures:

1) Allow creating of function literals anywhere, thereby also allowing definition of functions anywhere (simply assign them to variables). Only global functions are exported. 2) Function calls can no longer be half-statically dispatched as it is now... because the scope must be searched first to check if a function name is shadowed by a variable. This will significantly affect function call speed. 3) Allow functions to access the surrounding scope. This is a major change because we'll have to maintain a scope chain during function calls to search for variables, with all the speed issues (you cannot pre-generate an index offset to the current scope and must always search by name) and problems coming with it (such as which variable shadows which). 4) Captured environment can be implemented simply as a reference-counted hash map containing all the captured variables. However, that means we'll have to know, in advance, what variables are going to be captured, and whether variables in outer capture groups can be re-captured. Alternatively, all variables can be heap-allocated to be sure, but that's gonna be very slow. 5) Don't forget... eval will throw a wrench into all of these. Now I know why JS engine writers curse the evil eval. So better remove eval as a feature.

schungx commented 4 years ago

Conclusion:

1) first-class functions are quite easy to add. The cost is function call speed (the need to always check if a variable exists with the function name).

2) closures are going to have major architectural implications and extremely costly.

Question: What major values do closures bring to Rhai?

jkelleyrtp commented 4 years ago

Conclusion:

first-class functions are quite easy to add. The cost is function call speed (the need to always check if a variable exists with the function name).

closures are going to have major architectural implications and extremely costly.

Question: What major values do closures bring to Rhai?

Part of Lua's capability as a scripting language are its first-class-functions. Entire game engines and simulation engines are built around Lua tables, which are really just maps with first-class functions.

Today, this code doesn't work in Rhai:

fn integrate(particle) {
   particle.x = x + (dx * dt);
}

let particle = #{
  x: 0,
  integrate: integrate
};

particle.integrate(particle);

It would be even better to be able to capture the caller so particle is automatically passed into child methods, but that's less important. Of course, the same functionality could be extended by Rust, but the recursion bonuses and simple OOP would be super handy, and bring Rhai closer to Lua functionality.

I'm not sure if it's important that closures capture their environment, at least off the bat. There's quite a bit of utility here without environment capture.

schungx commented 4 years ago

I'm not sure if it's important that closures capture their environment, at least off the bat. There's quite a bit of utility here without environment capture.

Well, closures that cannot capture their environment become essentially function pointers. Therefore we're looking at function pointers as first-class functions, but not supporting closures.

I'd assume, in an environment where first-class functions is needed, there is always some state that the user needs to refer to in these functions. The only way without closures is to always pass that state as a parameter.

What your example shows is the need to have some form of poor-man's OOP that attaches behavior to objects dynamically. If this is the intended usage, there is actually a simpler way to do it:

// Don't forget Rhai functions are PURE
fn integrate(particle, dx, dt) {
    particle.x = x + (dx * dt);
    particle
}

let particle = #{ x: 0, integrate:"integrate" };

particle = particle.integrate.call(particle, dx, dt);

assuming that we implement a call function to String that simply calls that function by name.

It would still not be ergonomic since particle is passed by value, so you have a lot of copying around.

schungx commented 4 years ago

In order to have reasonable OOP and avoid the copying, we need to extend functions to support what we'd call, say, "method" functions:

method integrate_method1(dx, dt) {
    this.x = x + (dx * dt);    // 'this' is passed by reference
}

let particle = #{ x: 0, integrate:"integrate_method1" };

particle.call_method(particle.integrate, dx, dt);

// or invent a new syntax that is more C-like:
particle->integrate(dx, dt);

Essentially, a String becomes a function pointer. It will not be too slow because I can immediately hash that string and arguments then lookup the method function by hash.

This is also not terribly difficult to do, and it looks like it might actually be useful...

But the question: Is this something worth burdening the language with?

jkelleyrtp commented 4 years ago

Poor man's OOP would be a great step up, though closures would be nice too. Capturing self/this would probably be the most useful thing, and while not super ergonomic to call.

I can see a use case with simple configurators / builders that can let you quickly build a schema or define some sort of complex "interaction object" or even a simple state machine.

... a state machine designer for rust code would be amazing actually, great for simulation.

schungx commented 4 years ago

I can see a use case with simple configurators / builders that can let you quickly build a schema or define some sort of complex "interaction object" or even a simple state machine.

Can you give an example? I can't really visualize this...

BTW, which one of the below would you pick?

// option 1: no new syntax
particle.call_method(particle.integrate, dx, dt);

// option 2: new syntax
particle->integrate(dx, dt);

The reason for the additional syntax is that we don't want to go the wrong way of JavaScript - i.e. the confusion as to what this or self binds to. Therefore, methods are restricted to object maps...

jkelleyrtp commented 4 years ago

Does particle.integrate() not work where integrate is fn integrate(particle, dx, dt) where the first argument is passed by reference (think Rust’s self)? Or particle.call(“integrate”, dx, dt) ? The arrow looks simpler, but the .call() feels more at-home in a rusty scripting language.

schungx commented 4 years ago

particle.integrate(dx, dt) works just fine, but then it makes sense to go "real first-class functions" and introduce a new interior type to Dynamic which is a function pointer (actually just the function's name, but separate from strings).

Then we can easily do:

fn foo(a, b) { this.x = a + b; }

let obj = #{ x: 0, method: foo };
obj.method(1, 2);    // this binds to obj

let func = foo;
func(1,2);               // what does this bind to?

Having it look too much like first-class functions will always open up the can of worms which is the binding of this.

The alternative is a really restrictive design where it only: 1) works on object maps, 2) calls via a function name stored in a property, 3) with a new syntax. This way there is zero confusion.

particle.call(“integrate”, dx, dt) is a valid alternate design syntax, meaning that the call method is only available on object maps.

jkelleyrtp commented 4 years ago

particle.integrate(dx, dt) works just fine, but then it makes sense to go "real first-class functions" and introduce a new interior type to Dynamic which is a function pointer (actually just the function's name, but separate from strings).

Then we can easily do:
fn foo(a, b) { this.x = a + b; }

let obj = #{ x: 0, method: foo };
obj.method(1, 2);    // this binds to obj

let func = foo;
func(1,2);               // what does this bind to?
Having it look too much like first-class functions will always open up the can of worms which is the binding of this.

The alternative is a really restrictive design where it only: 1) works on object maps, 2) calls via a function name stored in a property, 3) with a new syntax. This way there is zero confusion.

particle.call(“integrate”, dx, dt) is a valid alternate design syntax, meaning that the call method is only available on object maps.

I would imagine somewhere in the middle where 'this' is always passed is by reference as the first parameter in the function, that way the function can mutate in place.

fn foo(this, a, b) { this.x = a + b; }

let obj = #{ x: 0, method: foo };
obj.method(1, 2);  // obj automatically gets passed in as the first argument

let func = foo;
func(obj, 1,2);    // same as above

I would imagine it's more just syntax sugar than anything else. This is how rust and python both do it with classes.

jkelleyrtp commented 4 years ago

For the state machine:

fn action(this) {
   if this.led_enabled {
      this.turn_off();
   } else {
      thus.turn_on();
   }
}
fn turn_on(this) {
   this.led_enabled = true;
   this.action();
}
fn turn_off(this) {
   this.led_enabled = false;
   this.action();
}

let machine = #{
   led_enabled: true,
   action,
   turn_on,
   turn_off
};

machine.turn_on();

The transition functions could be more complex than shown above, but theoretically you could have a scripting language on a microcontroller that executes state machines with an API for the Rust interfaces. You wouldn't need to flash the microcontroller, and could program it remotely by sending a new script over wifi/bluetooth. The one issue with the example above is the program's max depth. If the functions don't return anything, there shouldn't be anything to have to clean up (essence of state machine, like functional programming, but inverted).

schungx commented 4 years ago

let func = foo;
func(obj, 1,2);    // same as above

I did some simulations and it isn't easy to get this working without sacrificing a lot of speed.

For example, what does foo refer to? Remember Rhai is late-bound. Functions can be added later. Let's say:

fn func(a, b) { ... }
fn foo(x, y) { ... }

let foo = 42;   // or even something whacky like eval("let foo = 42");

let func = foo;    // ???

func(1, 2);     // ???

In every resolution of a variable access we need to search whether it is a defined variable or the name of a function. In every function call we need to search whether it is the name of a variable.

Also we need to decide whether variables trump functions or the other way round.

schungx commented 4 years ago

I have the following design which I think might work:

method action() {
   if this.led_enabled {
      this.turn_off();
   } else {
      thus.turn_on();
   }
}
method turn_on() {
   this.led_enabled = true;
   this.action();
}
method turn_off() {
   this.led_enabled = false;
   this.action();
}

let machine = #{
   led_enabled: true,
   action: Fn("action"),
   turn_on: Fn("turn_on"),
   turn_off: Fn("turn_off")
};

machine.turn_on();

let func = Fn("turn_on");
func.call(machine);

Without closures, we cannot make just one function fn turn_on_off(state) and then make closures for the turn_on and turn_off properties. We have to make them separate functions.

schungx commented 4 years ago

Spent some time today to implement function pointers.

Syntax:

let func = Fn("foo");    // function pointer to function "foo"

func.call(1, 2, 3);    // call a function pointer

Pull from my fork to test.

Right now you need to do:

let x = #{ action: Fn("lights_on"); }

x.action.call(x);

Next step is to implement scripted methods and the syntax will become: x.action().

schungx commented 4 years ago

PR https://github.com/jonathandturner/rhai/pull/172 adds rudimentary OOP support.

Relax: Rhai still doesn't have objects. This is Poor Man's OOP.

There are three new features that, together, should work reasonably like an OOP:

Function pointers - let var = Fn(fn_name) creation syntax and var.call(...) calling syntax.
Special support for object map properties that hold function pointers - omit the call, essentially turning obj.method.call(obj, ...) into obj.method(...).
this parameter support in script-defined functions that binds to the object when called in method-call style - notice that this is a breaking change because, in previous versions, calling a script-defined function in method-call style simply pushes an additional parameter in front of the arguments list. Now the two calling styles are completely different.

schungx commented 4 years ago

For the state machine:

In fact, this state machine seems to recurse indefinitely and will cause a stack overflow unless stopped by the max function-call depth.

The one issue with the example above is the program's max depth.

You're right, having function pointers in properties calling each other can easily blow the stack out of proportion. I don't think there is any way around it though... I guess the remedy is: don't do it.

schungx commented 4 years ago

PR https://github.com/jonathandturner/rhai/pull/206 lands full closures support including capturing.

rhaiscript / rhai

First-class functions and closures support #159