Simn / genjvm

13 stars 1 forks source link

Closures #2

Closed Simn closed 5 years ago

Simn commented 5 years ago

Both field and local closures don't work at the moment.

Simn commented 5 years ago

We should distinguish local functions from real closures that capture the environment.

Local functions

These are currently generated as static functions in the containing class scope. For instance i -> i * 2 becomes this:

  public static int hx_closure$0(int);
    descriptor: (I)I
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: iload_0
         1: iconst_2
         2: imul
         3: ireturn
      LineNumberTable:
        line 177: 0
      StackMapTable: number_of_entries = 0

I think this is going to be good enough for quite a while. Eventually, we should look into invokedynamic because that's how Java handles its own lambda functions. That instruction is not the most straightforward one though.

Actual closures

I think these have to generate classes to capture the environment. We have to check how exactly to handle this at the call-sites without causing overhead for all calls.

Simn commented 5 years ago

I wonder if Java lets us simply extend MethodHandle and override invoke. This would allow us to generate this:

class Main {
    static public function main() {
        var capturedString = "foo";
        function myClosure(arg:Int) {
            trace(arg, capturedString);
        }
        myClosure(12);
    }
}

Like this:

class MyClosure extends MethodHandle {
    var capturedString:NativeArray<String>;

    public function new(capturedString:NativeArray<String>) {
        this.capturedString = capturedString;
    }

    override function invoke(...args) {
        var arg = args[0];
        trace(arg, capturedString[0]);
    }
}
Simn commented 5 years ago

The answer to that is no: "Method handles cannot be subclassed by the user."

That's unfortunate.

Simn commented 5 years ago

This should work:

class Main {
    static public function main() {
        var capturedString = "foo";
        var myClosure:MethodHandle = hx_closure$0.bindTo(new MyClosureContext(capturedString));
        myClosure.invoke(12);
    }

    function hx_closure$0(context:MyClosureConect, arg:Int) {
        trace(arg, context.capturedString);
    }   
}

class MyClosureContext {
    var capturedString:String;

    public function new(capturedString:String) {
        this.capturedString = capturedString;
    }
}

I'm not sure yet what the best way to obtain a MethodHandle is, but that's a separate problem.

nadako commented 5 years ago

IIRC, C# goes even further and allocates the context first, and uses its fields for storing local vars instead of passing locals into the context when binding:

var $ctx = new MyClosureContext();
$ctx.capturedString = "foo";
var myClosure:MethodHandle = hx_closure$0.bindTo($ctx);

However this surfaces a semantic difference between C# and Haxe. In C# if you do:

var f = new List<Action>();
for (var i = 0; i < 10; i++) {
  f.Add(() => Console.WriteLine(i));
}
foreach (var x in f) {
  x();
}

it will print 10 ten times, because the closure will use the same i field that will be changed by the loop.

In Haxe, however this will capture the current value of i:

class Test {
  static function main() {
    var f = [];
    for (i in 0...10) {
      f.push(() -> trace(i));
    }
    for (f in f) f();
  }
}

Just something to keep in mind.

Simn commented 5 years ago

Algorithm:

(1) Actually, we need to know the name of the context class at this point already. Have to make sure the name is determined consistently.

(2) We have to be careful: We want to reference the captured locals, but they may be captured in our current context already in the case of nested closures. This means we have to do the full TLocal v typing instead of just grabbing the v_id.

Simn commented 5 years ago

@nadako: That looks broken to me... Not sure if design or oversight.

nadako commented 5 years ago

Not sure if design or oversight.

Exactly. But the good thing with this context preallocation is that you don't have to wrap every mutated variable in its own single-element-array.

Simn commented 5 years ago

My algorithm doesn't quite work: We need to know ahead of time if there's a context or not because if there is we have to add it as the first argument. That means we have to scan ahead in the tf-expr to find which vars it captures. This is always a bit annoying to handle because we have to respect all declarations within the closure as well.

Simn commented 5 years ago

This works now, but is going to require some cleanup.

I'm not sure if I'm using local classes correctly. I seem to set the correct attributes, but the decompiler just shows them as normal toplevel classes.

One optimization we should do is to not create the context class if we have only one captured variable. In that case we can just pass in the variable directly.

Simn commented 5 years ago

I wanted to post a before/after diff, but for some reason the decompiler says that the before-version cannot be decompiled...

class Main {
    static public function main() {
        var x = 12;
        function f() {
            Sys.println(x);
        }
        f();

        var y = 12;
        function g() {
            y = 13;
        }
        g();
    }
}
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

import haxe.jvm.annotation.ClassReflectionInformation;
import java.lang.invoke.MethodHandle;

@ClassReflectionInformation(
    hasSuperClass = false
)
public class Main {
    public Main() {
    }

    public static void hx_closure$1(int[] y) {
        y[0] = 13;
    }

    public static void hx_closure$0(Integer x) {
        System.out.println(x);
    }

    public static void main(String[] args) {
        int x = 12;
        MethodHandle f = (MethodHandle)"hx_closure$0".bindTo(Integer.valueOf(x));
        f.invoke();
        int[] y = new int[]{12};
        MethodHandle g = (MethodHandle)"hx_closure$1".bindTo(y);
        g.invoke();
    }
}

We should be good here!