johnynek / bosatsu

A python-ish pure and total functional programming language

removing automatic currying #1026

Closed johnynek closed 1 year ago

johnynek commented 1 year ago

The current representation of functions has only the single-argument form a -> b. We encode def foo(a: a, b: b) -> c: as a -> b -> c, like Haskell.

This creates two challenges:

  1. It deoptimizes functions, which often naturally have multiple arguments. In those cases we wind up building closures over the first N-1 args only to do the real work at the very end.
  2. It creates a situation where we only see a type error when we apply too many arguments to a function. When we don't apply enough, the call site still type checks in many cases, but then gives a confusing error later (see the sketch below).
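
For example, under today's auto-currying, something like the following type checks at the call site even though an argument is missing (a hypothetical sketch; add2 and partial are illustrative names):

def add2(x, y):
  x.add(y)

# under-applied: with currying, add2(1) is a perfectly good value of type
# Int -> Int, so the mistake is only reported later, wherever partial is
# used as if it were an Int
partial = add2(1)

With fixed arities, add2(1) would instead fail to unify with the two-argument function type right at this call site.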

So, instead of copying Haskell, we would be more like Scala 2 and have fixed-size functions, say up to size 8 or something. (Haskell 98 requires tuples up to size 15, and standard functions up to size 7.) It is sometimes said humans can remember 7 plus or minus 2 things at one time. Choosing 8 is more than 7, and a power of two.

So we would require applying all the arguments of a function at once; otherwise the type check fails, because we would be trying to unify with the wrong arity of function.

There are three syntactic forms of function application:

  1. normal application foo(a, b)
  2. dot-syntax: a.foo(b)
  3. left apply:
    x <- list.flat_map
    [x, x]
    #desugared into list.flat_map(x -> [x, x])

Normal application wouldn't change, except that it would no longer auto-curry. If you want partial application you would curry manually, e.g. x -> foo(y, x)

But for dot apply, we need to change things. Plausibly, a.foo(b) could desugar to either foo(a)(b) or foo(a, b). The latter is more in keeping with the ideas here, but then we have to deal with the fact that a.foo can't be a valid value on its own: under the second desugaring, (a.foo)(b) would be different from a.foo(b). Therefore, we need to ban bare a.foo and require a.foo()(b) for the case that is the same as foo(a)(b). Again, very arguably, this is more in keeping with the Bosatsu way: a function application always has a ().
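
Concretely, the proposed desugarings would be (a sketch of the rules just described):

a.foo(b)    # desugars to foo(a, b)
a.foo()(b)  # desugars to foo(a)(b), for when foo is explicitly curried
a.foo       # banned: no longer a valid expression on its own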

For the final case, left apply, I can see two possibilities: always expand into a function1, or allow some way of encoding the args. We already have (x, y) <- ... meaning a pattern match expecting one argument that is a tuple, and that seems right, so I wouldn't want to change that. One possibility would be to have x, y <- ..., without parens, mean a 2-argument function. That is a little weird, though, since in Python it is the , that creates the tuple. Also, the lambda syntax already uses (x, y) -> to mean a function2. Maybe the same rule should apply here, and if you want to pattern match on a tuple you use an extra layer of parens: ((x, y)) <- .... That would be consistent with the lambda syntax, which the <- is supposed to evoke anyway, as sketched below.
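
A sketch of what that would look like (proposed syntax, not what Bosatsu accepts today; foo2 and pairs are hypothetical names):

# a two-argument function: desugars to foo2((x, y) -> x.add(y))
x, y <- foo2
x.add(y)

# pattern matching a single tuple argument takes the extra parens:
# desugars to pairs.map(((x, y)) -> x.add(y))
((x, y)) <- pairs.map
x.add(y)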

I'm inclined to make this change, as I've been thinking more and more about actually generating code that performs well, and having explicit function arities, while possibly unmotivated by pure lambda-calculus theory, is very well motivated by the goal of running this code on existing CPUs or language VMs.

johnynek commented 1 year ago

The other thing that would need to change is that the type (a, b) -> c should parse as a function2, not a function1 of the tuple (a, b). Again, following the lambda syntax, to write a function1 of a tuple you use extra nesting: ((a, b)) -> c, or ((a, b, c)) -> d for instance.
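
Written out, the two spellings would be (a sketch):

# a function2 taking two arguments
f: (a, b) -> c
# a function1 taking a single tuple argument
g: ((a, b)) -> c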

johnynek commented 1 year ago

Making currying explicit suggests a Scala-like syntax for defs with multiple parameter lists:

def foo(x)(y):
  x.add(y)

# foo: Int -> Int -> Int

This allows a bit of a nicer syntax for the cases where you do want currying. A common performance reason to curry is when you can match on the first argument once and return a different function for each branch, so you don't re-run the match on every call. For instance:

def foo(x)(y):
  match x:
    case 0: y
    case notZero: notZero.add(y)

# the optimizer can rewrite this to:
def foo(x):
  match x:
    case 0: y -> y
    case notZero: y -> notZero.add(y)

The optimizer already moves matches on closed-over values outside of the lambda to take advantage of this. The syntax above would be a nice way to write such code without blocking that optimization.

johnynek commented 1 year ago

There is a connection to struct/enum size here, which I overlooked: since constructors are just normal functions, they have to have function types, so limiting functions to N args also limits structs to N fields.
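
For example (a sketch; Pair is an illustrative struct):

struct Pair(first: a, second: b)
# the constructor is an ordinary function, so under this proposal
# Pair: (a, b) -> Pair[a, b]
# which means a fixed family of function types up to N args caps
# struct arity at N fields as well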

If you want to be able to implement something like fast immutable vectors, you will likely want a branching factor of 32. That suggests a limit of 8 is too small. But writing out 32 function types in predef.bosatsu seems annoying, so maybe go back to programmatically adding all the Fn types.

johnynek commented 1 year ago

Some defs we do want to curry, like computing one equality or ordering from another. It would be nice to have a syntax like:

def eq_Opt(fn)(a, b):
  match (a, b):
    case (Some(a), Some(b)): fn(a, b)
    case (None, None): True
    case _: False

So then you can do things like eq_Opt_Int = eq_Opt(eq_Int).
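
Using it might look like this (a sketch; assumes eq_Int and the Assertion test constructor from the predef):

eq_Opt_Int = eq_Opt(eq_Int)
# eq_Opt_Int: (Option[Int], Option[Int]) -> Bool under this proposal
test = Assertion(eq_Opt_Int(Some(1), Some(1)), "Some(1) equals Some(1)")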