jckarter / clay

The Clay programming language
http://claylabs.com/clay

Explicit lambda capture clauses, stateless lambdas #135

Open jckarter opened 12 years ago

jckarter commented 12 years ago

It could be useful to allow environment captures to be made explicit in lambda syntax, if we can come up with a syntax that isn't too C++0x-ish. Having an arrow form to assert a stateless lambda would also be useful in contexts where a stateless function must be used.

ghost commented 12 years ago

Possible extension to lambda syntax:

var y, z = 6, 7;
x => x + ref y + z  // y captured by ref, z is copied free variable
x -> x + ref y + z  // ref y is a no-op, all captured by ref

-> guarantees stateless. I think this would cover all cases. Any thoughts?

Note that I have assumed here that capturing by ref and by value in the same lambda is useful in some way. It occurs to me that this may not be the case and that the current lambda syntax is sufficient, in which case I guess this issue is no longer an issue.
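For comparison, here is a minimal sketch of mixed capture in C++11, the explicit-capture design the opening comment wants to avoid resembling. It is not Clay code; the function name `mixedCapture` is made up for illustration. It shows the observable difference between the two capture modes: a later change to `y` is visible through the reference, while a later change to `z` is not.

```cpp
#include <utility>

// Hypothetical helper: returns the lambda's result before and after
// the enclosing scope mutates y and z.
std::pair<int, int> mixedCapture() {
    int y = 6, z = 7;
    // y is captured by reference, z is copied into the closure.
    auto f = [&y, z](int x) { return x + y + z; };
    int before = f(1);  // 1 + 6 + 7
    y = 10;             // visible inside f, because y is a reference
    z = 100;            // not visible inside f, because z was copied
    int after = f(1);   // 1 + 10 + 7
    return {before, after};
}
```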

jckarter commented 12 years ago

I think that would clash with the return ref syntax. I don't see how -> guarantees stateless: your example still captures y and z. I'm not sure how generally useful mixed capture is, but stateless lambdas, unlike other lambdas, can be turned into stateless CodePointers and CCodePointers, which is sometimes important when interfacing with C functions. Explicit capture is also useful for verification or documentation purposes, for example if there's a local that really shouldn't be captured.

ghost commented 12 years ago

I guess I was falsely under the impression that 'stateless' included capturing by reference, just no copies. My mistake.

jckarter commented 12 years ago

Technically you're right (the best kind of right), but by "stateless" I really meant "representable as a single function pointer".
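As a rough illustration of "representable as a single function pointer", the C++ sketch below passes a capture-free lambda to the C library's `qsort`. This is an analogy to the CodePointer/CCodePointer conversion mentioned earlier, not Clay code, and `sortedDigits` is a hypothetical name. The conversion only compiles because the lambda captures nothing; adding any capture would make the pointer assignment an error.

```cpp
#include <cstdlib>

// Hypothetical helper: sorts three ints through a C callback and packs
// the result into one number so it is easy to inspect.
int sortedDigits() {
    int values[] = {3, 1, 2};
    // No captures, so the lambda converts implicitly to a plain function pointer.
    int (*compare)(const void*, const void*) =
        [](const void* a, const void* b) {
            int lhs = *static_cast<const int*>(a);
            int rhs = *static_cast<const int*>(b);
            return (lhs > rhs) - (lhs < rhs);
        };
    std::qsort(values, 3, sizeof(int), compare);
    return values[0] * 100 + values[1] * 10 + values[2];
}
```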

ghost commented 12 years ago

I suppose the arrow syntax could be extended with ~> for stateless. Not a very clear distinction from the current syntax though.

jckarter commented 12 years ago

Yeah, that's maybe the least ugly choice. Unfortunately I think explicit capture clauses are just inherently ugly; if we free up the colon by removing the trailing block sugar, one possibility would be something like args ~>:(a, ref b) body.

Another interesting case to support would be capture-by-move, which is necessary to support capture of move-only objects like UniquePointer, and is also a good optimization for things like the monad pattern, where the lambda is the last consumer of the scope and so can move it into itself.
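For reference, C++14 expresses capture-by-move with an init-capture. The sketch below uses `std::unique_ptr` as a stand-in for Clay's UniquePointer (an assumption made for illustration; `moveCapture` is a made-up name) and shows that the closure becomes the sole owner while the original variable is left empty.

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: moves a move-only object into a closure and
// returns the closure's result, or -1 if the source was not emptied.
int moveCapture() {
    auto owned = std::make_unique<int>(41);
    // The init-capture moves the pointer into the closure.
    auto consume = [p = std::move(owned)]() { return *p + 1; };
    // A moved-from unique_ptr is guaranteed to be null.
    bool sourceEmptied = (owned == nullptr);
    return sourceEmptied ? consume() : -1;
}
```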

ghost commented 12 years ago

That's actually not too bad. If the frequency of use is quite low then a little ugliness isn't a big deal as long as the syntax is reasonably consistent/concise. Whilst on the subject of arrow syntax, it may be better to restrict arrow usage to lambdas (custom operators not included). For instance, the --> named returns arrow seems superfluous:

foo(x) : Int32, Int64 {...}
foo(x) : Int32 = ... ;
foo(x) returned:Int32 {...}

It's all pretty obvious what's going on without the arrow.

Also, the corresponding initialization arrow could be dropped:

foo(x) ret:Int {

    var a = 3;
    a = 2; // assignment
    initialize(a, 6); // function call required in the rare case of re-initialization or if you want to be explicit
    ret = 7;   // initialize and assign or just assign if already initialized

    // or maybe use ':=' for explicit initialization operator
    ret := 7;

}

jckarter commented 12 years ago

Well, initialization and named returns are unsafe and aren't meant to be used normally, which is part of the reason for the ugly three-character operators. They can't quite be replaced with a primitive, since multiple-value initialization (..a <-- ..b) needs to work. I agree though that the arrow notation is confusing. They could perhaps be replaced with an ugly-looking keyword like __init__.

ghost commented 12 years ago

Interesting, I thought named returns were the way to go when possible to do so.

Maybe initialize named returns by default (go-esque) and use an ugly keyword to circumvent this behavior.

jckarter commented 12 years ago

There isn't much benefit to pre-initialized named returns. Uninitialized named returns are necessary to implement some low-level value semantics for builtin types, and are otherwise only there so you can manually perform NRVO if you don't trust the compiler. If the compiler performed automatic NRVO like C++ compilers do, and primitives were provided for low-level elementwise initialization of tuples and records, then they wouldn't be necessary at all.
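To make the NRVO comparison concrete, here is a small C++ sketch with a type that counts its copies. `Tracked` and `build` are hypothetical names. Returning a named local is either elided entirely (NRVO) or falls back to a move, so the copy constructor is not expected to run in either case; whether the move itself is elided is left to the compiler, which is the uncertainty the comment above describes.

```cpp
// Hypothetical type that records how often it is copied.
struct Tracked {
    static int copies;
    int value = 0;
    Tracked() = default;
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked(Tracked&& other) noexcept : value(other.value) {}
};
int Tracked::copies = 0;

Tracked build(int z) {
    Tracked result;        // candidate for NRVO: may be built in the return slot
    result.value = z * 3;
    return result;         // elided or moved, never copied
}
```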

ghost commented 12 years ago

OK, I see. In that case the ugly keyword instead of arrows makes sense and the relevant compiler/primitive improvements should be raised as an issue (iirc NRVO already has a ticket) and named returns eventually removed.

jckarter commented 12 years ago

Yeah, eliminating named returns would be nice. A safer approach to guaranteeing NRVO would be to allow vars to be annotated saying "I really want this variable to be allocated into the return value", and having the compiler raise an error if it can't do that safely.

ghost commented 11 years ago

Do you mean explicitly annotated by the programmer?

Something like a return-binding:

foo(z) {
    return a = z * 3; 
    var b = bar(z);
    a += b; // a is automatically returned as already declared as return value
    return b; // this is a compile error
}

Looks kind of weird though . . .

jckarter commented 11 years ago

I'm not sure exactly what I mean. Something like that might work. Another possibility would be to allow named returns to be declared, but have the values be bound and initialized by vars within the return value instead of having them be bound implicitly and leaving initialization up to the user:

foo(z) a:Int {
    var a = z * 3;
    ...
}

galchinsky commented 11 years ago

What about operator overloading? The compiler would pass the list of detected captures to an operator function and store in the lambda whatever that operator returns.

(->)(forward ..x) = ..x;
(#>)(..x) = move(..x);
(=>)(..x) = ..x;
(~>)() = return Tuple[];
(~>)(..x) = staticAssert(false);
(++>)(..x, RefNamesList: Vector, CopyNamesList: Vector) = // static for over ..x, with C++-style explicit capturing

Party hard:

(O_o>)(..x) = create_and_return_shared_ref //It's like shared_ptr, but ref
(T_T>)(..x) = gc_controlled_ref

jckarter commented 11 years ago

The lambda arrow isn't quite an operator because the left-hand side is parsed as an argument list rather than an expression. However, capture behavior could be handled by a hook function. In newclay, I had lambdas work by desugaring into an expression [L] captureLambda(#L, ..freeVars), where L was a symbol representing the capture type. By allowing L to be extensible, you could support custom capture implementations; as you noted in #430, having to manually cast a lambda to Function is awkward, and providing a compact syntax for Function literals would make higher-rank functional programming easier. I'm not sure how the custom symbol would look; maybe you could reserve operator symbols ending in > as lambda operators or something cheesy like that.
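The hook idea can be approximated in C++17 as a loose sketch. This is not newclay's implementation: the tag types `ByValue` and `ByRef` stand in for the symbol L, the C++ function named `captureLambda` is a made-up analogue of the desugared call, and the free variables are passed explicitly rather than detected by a compiler. Each overload decides how the environment is stored, then calls the body with the captured values followed by the call arguments.

```cpp
#include <tuple>
#include <utility>

// Hypothetical capture-policy tags, standing in for the symbol L.
struct ByValue {};
struct ByRef {};

// Hook for by-value capture: copies each free variable into the closure.
template <typename Body, typename... Free>
auto captureLambda(ByValue, Body body, const Free&... freeVars) {
    return [body, env = std::make_tuple(freeVars...)](auto&&... args) {
        return std::apply(
            [&](const auto&... captured) {
                return body(captured..., std::forward<decltype(args)>(args)...);
            },
            env);
    };
}

// Hook for by-reference capture: stores references to the free variables.
template <typename Body, typename... Free>
auto captureLambda(ByRef, Body body, Free&... freeVars) {
    return [body, env = std::tie(freeVars...)](auto&&... args) {
        return std::apply(
            [&](auto&... captured) {
                return body(captured..., std::forward<decltype(args)>(args)...);
            },
            env);
    };
}

// Usage: the same body under two capture policies.
int demo() {
    int y = 6;
    auto body = [](int captured, int x) { return x + captured; };
    auto byValue = captureLambda(ByValue{}, body, y);
    auto byRef = captureLambda(ByRef{}, body, y);
    y = 10;                                // seen only by the by-reference closure
    return byValue(1) * 100 + byRef(1);    // (1 + 6) * 100 + (1 + 10)
}
```

A new policy, such as move capture or a reference-counted environment, would be one more tag and one more overload, which is roughly the extensibility the comment describes.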