bmx-ng / bcc

A next-generation bcc parser for BlitzMax
zlib License
33 stars 12 forks source link

Investigating Delegates and Closures #360

Open woollybah opened 5 years ago

woollybah commented 5 years ago

I'm currently exploring how we might add support for delegates and closures.

A delegate defines a function signature - a bit like a function pointer. A closure is a bit like an anonymous function.

Here's a basic delegate:

SuperStrict

Framework brl.standardio

Delegate MyDelegate(a:Int)

f2(f1, 5)

Function f1(a:Int)
   Print a
End Function

Function f2(d:MyDelegate, a:Int)
   d(a)
End Function

When run, it would output 5.

Here is an example that uses both a delegate and a closure :

SuperStrict
Framework brl.standardio

Delegate IntOperation:Int(i:Int)

Print "2 + 4 = " + CurriedAdd(2)(4)

Function CurriedAdd:IntOperation(a:Int)
    Return Closure(b)
        Return a + b
    End Closure
End Function

When run, it would output 2 + 4 = 6.

The closure would also have a shorthand form for single statements, so in the example above you could also write

Function CurriedAdd:IntOperation(a:Int)
    Return (b) >> Return a + b
End Function

As you can see above, we'd also be introducing some support for being able infer types.

Implementation

All of this functionality can actually be accomplished in BlitzMax already, but of course you would end up spending all your time writing a lot of boilerplate code to do it. The plan would be for bcc to generate all the necessary boilerplate and coerce the original code to use it.

Example 1 translated

A Delegate is basically a single method Interface, so our delegate would translate to

Interface _DMyDelegate
    Method callback(a:Int)
End Interface

where callback takes an Int and returns nothing.

The call to f2() would result in a mapping of the f1 function pointer to an instance of DMyDelegate.

Type _F1_DMyDelegate Implements _DMyDelegate

    Method callback(a:Int)
        f1(a)
    End Method

End Type

The actual call to f2() would change to

f2(New _F1DMyDelegate, 5)

where we'd now pass an instance of DMyDelegate into it.

The f2 function itself would change the function pointer like call to d into

    d.callback(a)

so you'd end up with something that looks like :

SuperStrict

Framework brl.standardio

f2(New _f1_DMyDelegate, 5)

Function f1(a:Int)
   Print a
End Function

Function f2(d:_DMyDelegate, a:Int)
    d.callback(a)
End Function

Interface _DMyDelegate
    Method callback(a:Int)
End Interface

Type _f1_DMyDelegate Implements _DMyDelegate

    Method callback(a:Int)
        f1(a)
    End Method

End Type

which, if you run it, outputs 5.

For the closure example, the delegate would translate to the following interface

Interface _IntOperation
    Method callback:Int(i:Int)
End Interface

The implementation for the closure also needs to store the a variable, so we end up with the following

Type _1_IntOperation Implements _IntOperation

    Field a:Int

    Method New(a:Int)
        Self.a = a
    End Method

    Method callback:Int(b:Int)
        Return a + b
    End Method

End Type

You can see here that we need to store the "environment" required to access any external variables that were referred to in the closure. In this case a is the external variable.

The closure itself is replaced by a call to the instance of the delegate which now contains it

Function CurriedAdd:_IntOperation(a:Int)
    Return New _1_IntOperation(a)
End Function

And finally, the function pointer-like call is modified accordingly

Print "2 + 4 = " + CurriedAdd(2).callback(4)

This results in the following code

SuperStrict
Framework brl.standardio

Print "2 + 4 = " + CurriedAdd(2).callback(4)

Function CurriedAdd:_IntOperation(a:Int)
    Return New _1_IntOperation(a)
End Function

Interface _IntOperation
    Method callback:Int(i:Int)
End Interface

Type _1_IntOperation Implements _IntOperation

    Field a:Int

    Method New(a:Int)
        Self.a = a
    End Method

    Method callback:Int(b:Int)
        Return a + b
    End Method

End Type

which, if you run it, produces the output 2 + 4 = 6.

HurryStarfish commented 5 years ago

Lambdas and Closures are exciting, and something I've been hoping to implement in NG someday. :slightly_smiling_face: I'll write down my thoughts about implementation details at another time (your approach to capturing looks quite C#-like, which was my plan as well); this is a summary of my ideas on the syntax:

Local add2:Int(Int)! = CurriedAdd(2) Print "2 + 4 = " + add2(4)

That way, they should be fairly easy to learn: a closure could be written and used just like a regular local function, except it can capture variables from the surrounding scope.
- Lambda syntax: This would be a shorthand syntax for defining closures (or, if nothing is captured, regular functions) whose body only consists of one expression. Since BlitzMax already has a general syntax for local functions (and with the above proposal, also a similar syntax for closures), the main purpose of lambdas would be *conciseness*, so their syntax should be minimalistic and short. I'd suggest introducing two new symbols `\` and `->`, like they are used in [Haskell](https://wiki.haskell.org/Lambda_abstraction). The backslash `\` is meant as a visual approximation of the letter lambda (λ), the right arrow `->` is commonly used for lambdas or function types in various other languages as well. Both happen to still be unused in BlitzMax, so that works out perfectly. :) With this, `CurriedAdd` could be written like so:

Function CurriedAdd:Int(Int)!(a:Int) Return \b -> a + b End Function


(In theory, we could omit the `\`. Some languages, including Java and C#, do not have any such prefix for their lambdas; their syntax simply consists of the parameter list, the arrow, and the expression body. I am in favor of having the prefix though, because not only does it make the syntax easier to parse for the compiler, it's also a bit clearer for the programmer. When you're reading from left to right, you'll first reach the `\`, which makes it immediately clear you're looking at a lambda.)
- Local type inference: Kind of a prerequisite to make all of the above work smoothly. Just like in your initial post, I didn't specify the type of the parameter `b` in the above example. The return type of lambdas should also be inferred, based on the expression that makes up their body. This stuff can get very complicated, especially when generics are thrown into the mix, but a basic form of it might be not too hard to get working. Also, iirc type inference for `Local` declarations (e.g. `Local i := 5` or `Local add2 := CurriedAdd(2)`) is pretty much already implemented in BlitzMax-NG bcc and would just need to be "enabled"; I guess this would be the time to do that.

- Other operators: I've been brainstorming/prototyping a little library of simple utility functions that could be created to make functional-style programming with closures more convenient. Things like function composition, negation of boolean-returning functions, currying, switching the order of parameters and partial application might be useful. At a later point, it might be worth considering adding new operators to make these operations more convenient to use. For example, [`>>`/`|>` for function composition/piping like in F#](https://blogs.msdn.microsoft.com/chrsmith/2008/06/14/function-composition/) might be nice. :slightly_smiling_face: 
HurryStarfish commented 4 years ago

I've implemented a first version of lambda expressions on my fork of bcc. Currently, they're just a shorthand syntax for regular functions (no closures yet) and they require their argument types to be specified, whereas the return type is inferred from the body.

So for example, this

Function DblFunc:Int(i:Int)
    Return 2 * i
End Function

Local dbl:Int(i:Int) = DblFunc
Local x:Int = dbl(5)

can be written more compactly as

Local dbl:Int(i:Int) = \i:Int -> 2 * i
Local x:Int = dbl(5)

or as

Local x:Int = (\i:Int -> 2 * i)(5)

To make this more useful, the next step would be "target typing": making the lambda able to infer its argument and/or the return types from the declaration of whatever it is being assigned to. For example, all of these should work:

Local intToStr:String(i:Int) = \i:Int -> i   ' currently: Compile Error: Unable to convert from Int(Int) to String(Int).
Local dbl:Int(i:Int) = \i -> 2 * i   ' currently: Compile Error: Missing type specifier.
Local strlen:String(s:String) = \s -> s.length   ' both problems combined

In practice, this would especially be useful for passing lambdas as arguments to other functions. Other possible applications for target typing, aside from lambda, might be array literals (inferring the element type) and generic types/functions/methods (inferring type parameters).

However, I'm not yet sure about the best approach to implementing this. Expressions that can be target typed must somehow need to know about their target during semanting. So the target would probably need to be recursively passed "inward" to expressions during either the semanting, or already the parsing phase. Additionally, the target must be semanted first: for example, to figure out the type of the lambda in f = \i -> i * 2, it would be useless to have f in the form of a TIdentExpr. This seems complicated to ensure in the current bcc, given how every statement and expression is individually responsible for somehow semanting its children and doesn't normally know about its parent. Any ideas?