ponylang / ponyc

Pony is an open-source, actor-model, capabilities-secure, high performance programming language
http://www.ponylang.io
BSD 2-Clause "Simplified" License
5.71k stars 415 forks source link

Lambda Types #382

Closed jtfmumm closed 8 years ago

jtfmumm commented 9 years ago

Since lambdas are anonymous, they generate types like "$0$4 val" in compiler errors. I think it would be nice if there were a built-in interface that lambdas implement, something like:

interface box Fn[A,B]
  fun apply(a: A): B

So far I've been adding that interface to all my projects in order to write higher-order functions that take functions as arguments. That's not too much of a problem, but it would be nice to see what the compiler believes the signature of a lambda to be when reporting errors.

andymcn commented 9 years ago

The problem with an interface like that is it has to know how many arguments the lambda function takes, so a single builtin type is tricky.

Error messages referring to anonymous types definitely need improving, and this is already on our to do list.

jtfmumm commented 9 years ago

Ok.. yeah, I've been using Scala-style Fn1, Fn2, Fn3, etc. Not necessarily ideal

ryanai3 commented 9 years ago

Yous should be able to pass in the set of args in a tuple.

ryanai3 commented 9 years ago

When I've run into issues, I use

let x: Fn[(U64, U64, String), String] = ...

instead of a Fn3

jtfmumm commented 9 years ago

In this case though, the lambda becomes messy. Do you have to destructure the tuple in the body of the lambda first (to avoid args._1, etc.)? And having to name the tuple in the lambda as a parameter would be awkward.

jtfmumm commented 9 years ago

Though maybe I'm not thinking of a cleaner way.

ryanai3 commented 9 years ago

Imo, the cleanest option would be a builtin syntax for specifying lambda types. I've been making do with a "Tupled" primitive that turns a function with arguments into a function that accepts a tuple matching the argument types in order

ryanai3 commented 9 years ago

some scalaesque: let fn: ref(U64, String, Wombat val) => F32 would be nice

jemc commented 9 years ago

Theoretically Pony could support a syntax for destructuring tuple parameter elements to local names in the function. Something like:

fun tag foo((x: U32, y: U32)): U32 => x + y
fun tag test_foo() => foo((1, 2))
jtfmumm commented 9 years ago

This is true, though in the case of lambdas it seems as though the tuple could be implied. Perhaps the ideas are related.

ryanai3 commented 9 years ago

That would be really convenient, and would solve most of these issues.

ryanai3 commented 9 years ago

This might have issues with partial application though :\

andymcn commented 8 years ago

The current plan is to keep the language exactly the same, but in error messages use type descriptions something like (exact format TDB):

lambda(x: U64, y: Bool) => U64

We did originally have a syntax for specifying function types inline. However, we found it made the code much less clear and harder to read, especially when the types involved were not trivial. Now we recommend using a named interface:

interface MyLambdaType fun apply(x: U64, y: Bool) => U64

Automatically destructuring tuple parameters would cause us quite a lot of problems, with parsing and correctness, so this is unlikely to happen.

Other than the error messages, which admittedly are currently awful, I don't quite see what you're trying to achieve here. What problem is all of this lambda interface and tuple parameter stuff trying to solve?

jtfmumm commented 8 years ago

When writing functional code, it's common to pass around anonymous functions. So if it's not built in, I'll always be adding interfaces Fn1, Fn2 etc to a new project and importing them everywhere. It's not that bad, just inconvenient.

jtfmumm commented 8 years ago

The error messages you describe solve the real problem.

jtfmumm commented 8 years ago

Of course, what I'd really like to do is write signatures like

fun map[A,B](lst: List[A], f: A => B): List[B]
pyrossh commented 8 years ago

Creating anonymous functions does make it easier for programming but in strongly typed languages even in go we usually assign a type name or interface to that function so that we could understand what it does easily. I think the interface approach is good and a lot like java and gives us a specific structure when coding.

jtfmumm commented 8 years ago

I haven't found lightweight function types like A => B a barrier to understanding in strongly typed languages like OCaml, Scala, or Haskell. And at least personally, I've found dealing explicitly with functional interfaces in Java cumbersome. Naming is important in either case to ensure clarity of intent. I assumed the reason simple function types don't exist in Pony was because of technical challenges.

ryanai3 commented 8 years ago

even go does have a syntax for inline function types though

 func(int, int) string

The same naming & structure argument applies to anonymous functions - why have them when we can just write up a primitive with a name, and it's only a couple more lines of code. The "primitive" approach gives us a specific structure, it's easy to understand what it does, and yet at the same time, it's much less convenient. In terms of readability, if you're going through a bunch of code that expects functions and passes around functions and returns functions, often it's easier to read that the function expects: ref(Wombat val, U64) => Bool ("A function that takes a wombat, number, and returns a bool, sweet!") vs. a function that expects: IsWombatHungryFn ref , and then you have to go searching through the code to see where it is defined and what it means.

I can see the argument for not allowing full interface types inline, given that just after 2 or 3 functions on an interface, it gets really long, but when it's just a one- liner simple expression like ref (A val, B iso) => Wombat tag or ref(A val -> B iso -> Wombat tag) (or whatever syntax you come up with), it's something that's definitely a lot more convenient. I'm a big fan of deferring to the programmer in this case- if they pass around a lot of random small functions, it's definitely easier to define them inline. If they get to the point where everything expects or returns or messes with a ref (Wombat val, U64) => Bool , then that's the point where i'd expect them to stick it in an interface.

thoughts?

ryanai3 commented 8 years ago

the one thing that makes writing function types possibly a bit more clunky in pony (from a syntax perspective) is the existence of reference capabilities (especially on the function itself!). there are a number of nice syntaxes that look fine with it though (imo):

scalaesque fn: val (Wombat Ref, U64, String) => Bool Haskellesque fn: val(Wombat Ref -> U64 -> String -> Bool) , or even fn: (Wombat Ref -> U64 -> String -> Bool) val

andymcn commented 8 years ago

This is interesting. Our thinking for how Pony currently does things was based on the following assumptions:

First that if you're using a type as long as fn: val (Wombat Ref, U64, String) => Bool more than twice you're going to create an alias for it anyway.

Second that all programmers these days use modern IDEs or editors in which finding a type definition requires a single keypress (or key combo). This is particularly easy to set up for Pony because it's very easy to parse.

Given those assumptions defining function types inline seems to be a source of clutter rather than clarity. I presume you disagree.

ryanai3 commented 8 years ago

I agree with the assumptions, but not with the conclusions. In my mind, those points apply equally as well to anonymous functions/lambdas - if you're using a function in your code, you're probably going to use it twice anyway, so why not force the user to create a primitive. Ditto with inline objects.

Given the modern state of tooling (IDE's, editors), that function is just a single key combo away. In theory, inline lambdas are a source of clutter - it might be more clear to write the lambda or anonymous object separately and refer to it.

I think the problem is slightly obscured by the current state of the pony compiler - lambdas are the only types (other than literals in some cases) where you don't need to declare their type - it's inferred.

let u = lambda (x: U64) => x + 3 end

Imagine if we had to do:

let u: AdderFunction = lambda (x: U64) => x + 3 end
// many lines later
interface val AdderFunction
  fun val apply(x: U64): U64

everytime we defined a lambda.

So instead of having to create an interface everytime we want a lambda, the compiler helps us out and infers the type. Because we don't have a syntax for inline function types we have to do this every single time we expect a function, store a function, or return a function. (For the example above, you're fine with a lambda, but the moment you want to return it, you have to write up a whole interface to declare the return value, even though all you're doing is declaring (U64 -> U64).

When I've been doing some functional stuff in pony, I haven't found myself making a nice interface for every function. What I do (and what it sounds like most people do because, let's face it, we're lazy, and who wants to declare an interface every time you expect or store or return a function, especially when it's parametrized or just a simple one-off) is to declare:

type Fn[A, B] is (RefFn[A, B] | ValFn[A, B])
type BoxFn[A, B] is FnBox[A, B] box // to force it to be a box every time vs just default, because of course we want a callable function when we type it as a parameter or return value
interface FnBox[A, B]
  run box apply(x: A): B
type RefFn[A, B] is FnRef[A, B] ref // to force it to be a ref every time vs just default
interface FnRef[A, B]
  fun ref apply(x: A): B
type ValFn[A, B] is FnVal[A, B] val // to force it to be a val every time vs just default
interface FnVal[A, B]
  fun val apply(x: A): B
//... through all the capabilities
type Fn1[A, B, C] is (BoxFn1[A, B, C] | BoxFn2[A, B, C])
type BoxFn1[A, B, C] is FnBox1[A, B, C] box // to force it to be a box every time vs. just default
interface FnBox1[A, B, C] 
  fun box apply(a: A, b: B): C
type RefFn1[A, B, C] is FnRef1[A, B, C] ref // to force it to be a ref every time vs just default
interface FnRef1[A, B, C] 
  fun box apply(a: A, b: B): C
type ValFn1[A, B, C] is FnVal1[A, B, C] box // to force it to be a val every time vs just default
interface FnVal1[A, B, C]
  fun val apply(a: A, b: B): C
//... ad infinitum, or however many args you want to support (scala's fun21, maybe?)

essentially the poor-man's function type declaration syntax.

I'd propose somethiing a little different - it doesn't seem all that hard to parse (I can't claim to be an expert, so thoughts?), it's clear as to what it does, and in the majority of places, it reduces clutter. (v just some ruby string interp syntax:)

"#{FN_CAP --capability of fun apply() - default is val} (#{Arg0 type} -> #{Arg2 type} -> ... -> #{Return arg type}#{OBJ_CAP --capability of object - default is FN_CAP"

thus, for the bast majority of functional programs, inline function syntax is super easy:

fun val compose[A, B, C](fn1: (A -> B), fn2: (B -> C)): (A -> C) => //implementation

let's say you want something more complex - you're writing a foreach over your custom collection and you want to support functions with ref members too, so that foldl becomes easy - pass in an accumulator that keeps the result in an internal var, and just applies the function it was constructed with ...:

// this function accepts vals and refs!!
fun val foreach(fn1: trn(A)) => //Implementation

trn? that's right, trn! (it is useful, after all :p ) the standard subtyping graph for capabilities: tag --> box --> ref -------v |----> val ---> trn --> iso actually ends up flipped (!) for function types - (if you can call fun iso apply() on a thing, you can call any other capability (never mind aliasing for now) function on that thing), so the new graph is: iso --> trn --> ref -------v |-----> val ---> box ---> tag

This gives us some nice categories of types - where the capability reflects what will happen to the type between, and due to our calls: iso(A -> B) // calling the function consumes it trn(A -> B) // Only our calls can change function behavior - (others can read from it though!) ref(A -> B) // Function behavior may change at any time (our calls or otherwise) val(A -> B) // Function behavior will never change at any time - function is shareable with any actor! box(A -> B) // Locally immutable calls - our calls will never change function behavior tag(A) // ... we can call it

by having a sensible default for capabilities when not specified (FN_CAP = val, OBJ_CAP = FN_CAP), we can have a syntax that is simple, easy to use, and looks nice.

fun val decode(x: Seq[U64], fn: (U64 -> U64)): Seq[U64]
fun val decode_with_changing_func(x: Seq[U64], fn: ref(U64 -> U64)): Seq[U64]

vs.

fun val decode(x: Seq[U64], fn: U64Decoder val): Seq[U64]
fun val decode_with_changing_function(x: Seq[U64], fn: U64MutableDecoder ref): Seq[U64]
//... later in code
interface U64Decoder
  fun val apply(x: U64): U64
interface U64MutableDecoder
  fun ref apply(x: U64): U64

vs. what everyone will do anyway in the absence of a nice syntax for this

fun val decode(x: Seq[U64], fn: ValFn[U64, U64]): Seq[U64]
fun val decode_with_changing_func(x: Seq[U64], fn: RefFn[U64, U64]): Seq[U64]

I find the first option easier to read, easier to write, and if I need that function type more than once, I can just yank the type declaration and paste it into:

type U64Decoder is (U64 -> U64) // Makes the type theoretician in me sing with joy

Sorry for the long post - I've just done a lot of thinking on this subject, and it's something I'm really interested in having.

Thoughts?

ryanai3 commented 8 years ago

Partial functions would have a ? In the inline, and the parens allows us to nest function types a la Haskell: ex: fn: (A -> (A -> B) ? -> B)

The reason a (a, b, c) => D doesn't work is because it looks very similar to a case clause in a match on a tuple

andymcn commented 8 years ago

@ryanai3, interesting what you say about having to declare the type of lambdas. "In theory, inline lambdas are a source of clutter - it might be more clear to write the lambda or anonymous object separately and refer to it." I completely agree and would not be at all sorry if we did things like that. Originally we didn't have lambdas (although we did allow inline object literals) and only added them after people complained about having to type too much :)

Purely from a parsing perspective we couldn't use a syntax like (U64 -> U64), at least not without changing a lot of the existing syntax. So let's say that hypothetically we decided to add inline function type syntax, what could we use?

Let's consider a function taking a U64 and a Bool parameters, returning a U32 and being partial, ie like:

fun box apply(x: U64, y: Bool): U32 ?

Firstly, remember that in Pony a function type is really a special case of an interface. So one option is to make interface types inlinable and just use that. Possibly we'd need a separate keyword to interface, so for now I'll use itype. Our example would then be:

itype fun box apply(x: U64, y: Bool): U32 ? ref

That's not very good, but it gives us a starting point to compare to.

Note that we do need 2 capabilities. There's what kind of reference we have to the function object (ref in this example), which we need to know for doing anything with the reference other than simply calling it, ie taking copies, passing it to other functions, sending it to other actors, etc. There's also the receiver capability (box in this example) which we need to know to tell if we can call the function given the kind of reference we have. Remember that Pony lambdas can have mutable state and so may need a ref receiver. They can also have no state, ie tag, they can be deeply immutable, ie val, or they can just be box. This is all relevant if you want to pass your function to another actor.

So what's the minimal function type syntax we can get? To work with Pony's grammar we'd need to start with a new keyword (or possibly a symbol). We can't reuse fun, so for now I'll use fntype. fn, ftype, etc are also options.

We can't use -> since we already use that for viewpoint adaptation. Also Haskell style syntax would give us problems distinguishing between a function taking one parameter and returning a value, and a function taking two parameters with no return value. For this reason I personally consider the Haskell syntax to be one of the worst function syntaxes ever invented. The very worst is C's function pointer syntax.

So let's try something more Pony-esque. How about:

fntype box(U64, Bool): U32 ? ref

Unfortunately the trailing ? and return type are ambiguous in the case of something returning a function type. So if we move them inside the parentheses we get:

fntype box(U64, Bool: U32 ?) ref

Assuming sensible default capabilities the common case becomes:

fntype(U64, Bool: U32 ?)

That is the simplest form I've up with that works and it's clearly better than the inline interface example above. Compare it to your generic approach:

FnBox2P[U64, Bool, U32]

(the P indicates partial).

Subject to a sensible naming scheme for the generics, is the fntype approach much better? Would putting FnBox2P etc in the standard library be better?

ryanai3 commented 8 years ago

@andymcn

"the very worst is C's function pointer syntax."

With you on that one!

After some informal surveys of my hardcore functional friends, fn was the most popular, with "use some random symbol" a distant second.

fntype is a bit long, and fn strikes a nice balance between readable and short.

I definitely think that an inline fn , or $ approach works better - it's more readable, it's more obvious what it's doing, and it's eyecatching.

w/ regards to the syntax, if it's not too messy from a parsing perspective, I'd propose:

fn (U64, Bool ? => U32) vs the fn (U64, Bool : U32 ?)

I think it's clearer at first pass, especially since it looks very similar to the tail of the function definition syntax.

I'd propose a different set of defaults for the capabilities - the capability on the function should be a val by default, since the most common use of passing around lambdas and typing on them is probably going to be pure functions in the immutable sense anyway. Either way, the capability of the lambda itself (the second one in the syntax) should default to whatever value the first one has - often you only care about being able to call the function, not that it's specifically a val or an iso.

Maybe I misunderstand subtyping with functions, but if you want to admit both a pony val (U64, Bool => U32) and a pony ref (U64, Bool => U32)

Wouldn't the type be pony trn (U64, Bool => U32) and not pony box (U64, Bool => U32) ?

since, after writing out the object capability, you get: pony trn (U64, Bool => U32) trn pony box(U64, Bool => U32) box

and you can call a ref or a val on a trn, but you can't do the same with a box?

jemc commented 8 years ago

Assuming sensible default capabilities the common case becomes:

fntype(U64, Bool: U32 ?)

That is the simplest form I've up with that works and it's clearly better than the inline interface example above. Compare it to your generic approach:

FnBox2P[U64, Bool, U32]

(the P indicates partial).

Subject to a sensible naming scheme for the generics, is the fntype approach much better? Would putting FnBox2P etc in the standard library be better?

At first glance, the proposed inline syntax here is off-putting to me - having the return type and partial marker ? inside the parenthesis is unsettling (arguments/parameters are supposed to be inside parenthesis), and the Bool: U32 part is particularly confusing, as an otherwise Pony-attuned brain would try to interpret U32 as the type of Bool.

My initial thought here is that the special inline syntax is not really saving any trouble compared to FnBox2P[U64, Bool, U32] (note that the two are roughly the same number of characters, depending on how much whitespace you use), so it's not worth the cognitive overhead of having to remember a different set of nested syntax rules in a complex function signature. That is, since FnBox2P[U64, Bool, U32] already works using syntax consistent with what we already have, what's the advantage of adding more syntax?

ryanai3 commented 8 years ago

It can get confusing very quickly, especially for a lot of the functional style people like to do.

// some hypothetical function that takes in a U64, and a bool and gives you back a function from U64 to U32's FnBox2P[U64, Bool, FnVal1[U64, U32]] vs: fn (U64, Bool ? => fn val(U64 => U32) )

or, let's say a Weird zip function that zips partials together and uses the first argument as a default in the case that either fail.

FnBox3[B, FnVal1P[A, B], FnVal1P[A, B], FnVal1[A, B]] vs. fn box(B, fn (A ? => B), fn (A ? => B) => fn (A => B))

the second example is definitely not something you'd be doing, but you can see how the FnBoxX approach quickly becomes hard to read. they're about the same length, and that's with nice spacing on the second one too.

I'd argue that the inline syntax is a lot clearer, and it's easier to edit, since facts about the function (capabilities, whether it's partial or not, return type or not, how many values or not) are explicit in the syntax, and not something we need to figure out by looking at the function name, and then going through the comma separated types that have no distinguishing features between, types of arguments and return types

jemc commented 8 years ago

I must confess that in each of your examples, the upper is much easier for my eyes to parse :grin:, especially when it comes to the => at different nest levels.

ryanai3 commented 8 years ago

maybe I've just written too much haskell :p box(B -> (A ? -> B) -> (A ?-> B) -> (A ->B))

andymcn commented 8 years ago

I agree with @jemc that the FnBox3 style examples are easier for my brain to parse than the Haskell style ones. I think it does depend on whether you're used to writing functional code or not. box(B -> (A ? -> B) -> (A ?-> B) -> (A ->B)) in particular looks like something you'd see on a blockboard in a maths lecture, not code at all.

@ryanai3, you seem to have the capabilties for functions the wrong way round some how. A function capabilty says what the function is going to do to the receiver, eg box means "I am going to read your fields, but not write them or send you", which means any capability other than tag is fine. And iso means "I am going to read and write your fields and send you", so the receiver had better be iso.

ryanai3 commented 8 years ago

@andymcn. Right, so if you've got a box, you can't call ref methods on it. which means if you want to accept lambdas with a ref apply and a val apply, your parameter can't be a box, right?

It's should be the least restricted type that you could call a ref and a val on, a trn

so a function fun do_something_with(f: box(A => B)) cannot accept a ref(A => B) ?

when I was trying to mess around with some fp stuff, I noticed that all interfaces with a simple fun SOMECAP apply were subtypes of

interface ISOFUN[A, B]
    fun iso apply(a: A): B
ryanai3 commented 8 years ago

@andymcn @jemc

looking back over it, there's another inconsistency in constructing types that pertains to this:

how many times have you done:

let tup: (U64, U64, Bool, Wombat)

if we're going to be purists about type constructor syntax, this should be

let tup: Tuple4[U64, U64, Bool, Wombat]

there's nothing clearer than this! (tuple is even in the name! also the number of arguments).

it even looks better when we have nesting:

let tup2: Tuple3[Tuple2[U64, U64], Tuple3[Bool, Wombat, U32], String]

vs.

let tup2: ((U64, U64), (Bool, Wombat, U32), String)

The first syntax is absolutely simpler for people who aren't all that used to messing with tuples.

so that's tuples (cartesian products)

we've also got some other quirks around type constructor syntax:

type unions:

type Maybe[A] is (Just[A] | None)

maybe this should be:

type Maybe[A] is (Either[Just[A], None])

I really don't see what makes tuples a special case over functions:

  1. The same argument applies -> If someone is taking, or returning, or storing a tuple, they're probably using it more than once. People also have IDE's and jump-to. Lets go the java way and force them to create a bunch of wrapper objects.
  2. Why have an inline syntax for it? it's potentially confusing and makes parsing harder. Let's just stick Tuple1, Tuple2, Tuple3, Tuple4, Tuple5 etc. into the standard library.

In my mind, if you want to support functional programming idioms and have that be easily done in pony, you need an inline function type constructor. Also, I feel like you're cheating yourself a bit when you offer easy type constructors for type unions and intersections, for tuples, but not for functions.

Haskell gets away with the easy '->' because there is no such thing as a function that doesn't return anything. Because to have side effects, you need to deal with a state monad or something. Scala has solved this problem. Functions that don't return anything return Unit (or in our case, maybe None)

Given the fact that the type constructor for tuples have to be parsed on just seeing parentheses where one expects a type, I'd love to see something like:

let fn: (U64 -> Bool ? -> U32)
let fn2: (U64 -> Bool -> Wombat -> Unit)

but given viewpoint adaptation already uses ->, this makes things awfully ambiguous.

We can go with a more haskell style:

let fn: (U64 => Bool => Wombat ? => Unit)
let fn2: box(U64 => Bool => Wombat) iso

or a more scala style:

let fn: (U64, Bool, Wombat ?=> Unit)
let fn2: box(U64, Bool, Wombat) iso

or:

let fn: (U64, Bool, Wombat: Unit ?)
let fn2: box(U64, Bool, Wombat) iso

if we absolutely need a delimiter to tell us that whats coming is a function type, It should be lightweight, i.e.: fn(U64 => Bool) or $(U64 => Bool)

Sorry if I sound a bit harsh, but this is something I find very important.

Thoughts?

andymcn commented 8 years ago

@ryanai3, please don't worry about sounding too harsh, we want to get real users' opinions, even when they don't agree with ours, in fact especially then! But similarly, realise that we aren't necessarily going to come round to your point of view.

What you say about unions, intersections and tuples is interesting. However, they all have fundamental differences to other types in the language.

Unions and intersections are both unordered and flattenable. That is: (A | B) is the same type as (B | A) and (A | (B | C)) in the same type as (A | B | C) Neither of these things is true for normal generics. MyType[A, B] is not the same type as MyType[B, A] unless A and B are the same.

Tuples are ordered and not flattenable. However, they are passed by value, whereas normal Pony types are passed by reference. This means that tuples are not actually objects that are constructed as such, they are just a collection of things that happen to be passed around together for now. It also means tuples do not have a capability and cannot be compared by identity, only their elements do/can.

With regard to your suggested function type syntax, I'm afraid that all of your examples that don't start with a new keyword / symbol are ambiguous in one way or another. This is partly because of the extra restrictions we put on the Pony grammar to keep parsing "simple" (a term I'll keep vague here as the explanation is rather complex, but it boils down to being able to tell what a symbol means when we see it, rather than only finding out much later).

I doubt we'd ever use a leading symbol for this because there aren't many symbols left unused ($ is actually used internally and so not available), plus it's not obvious enough what it means. For a keyword we'd probably want something longer than fn and preferably including "type" to avoid confusion with fun. In general we prefer to make things clearer to read rather than save a few characters of typing.

We are also very unlikely to ever use Haskell style syntax. Personally I consider currying to be a terrible idea and find it makes things very confusing for non-functional programmers. Pony will always make the distinction between parameters and return values.

When you say that "you need an inline function type" syntax, I'm a little dubious. This is purely syntax we're talking about here. None of this will allow you to do anything new with the language, it'll just allow you to do existing things more tersely.

It's probably also worth pointing out that Pony is not intended to be a functional language. There are some great aspects of functional programming and we don't want to do anything that would make them impossible (unless there's a very good reason). But we do not explicitly favour, or strive to fully embrace, the functional programming idiom, or indeed any other existing idiom.

ryanai3 commented 8 years ago

@andymcn My points about tuples were not about tuples themselves, but rather the type construction - replace "wrapper object" with "predefined struct val" - the point is to question why exactly tuples have an inline syntax when they could be specified by Tuple3[A, B, C]? Why do we let people declare tuples just via parens and commas? Why don't they put a tuple keyword before it. What exactly makes them special? It's just more terse, after all. Everything you can do with tuples you can do without them. Heck, since you're probably using the same tuple often, you should probably write a wrapper class/struct for it - it's just a single key combo away.

My answer? For a lot of one-offs, for a lot of prototyping, it's easier and faster to write the types for tuples and hand around tuples than write a 5-line wrapper object (or struct) for every single different tuple you use. Often it can also be more clear, and more readable, and we already know that one of the two hard things in computer science is naming things, I don't like naming a million and one wrapper classes :( .

All the above points apply for inline function types.

It's not about being a functional language - it's about supporting functional idioms. Hell, C has a function type syntax (granted, it's pointers, and it's awful), but no one in their right mind would call C a functional language. But it's incredibly useful when you need it. I'm making the claim that to support functional idioms well you really need a couple things:

  1. First class functions (obviously)
  2. if you're going to be statically typed, that needs to be easy.

    I agree with you, I don't think Pony should be Haskell 2.0. But functional idioms are incredibly useful. When I think of languages that support functional idioms (go, scala, python, ruby, nim, erlang, etc...) and those that don't (java?), I can't think of a single one that supports functional idioms that doesn't have inline function type syntax (or is dynamic, so moot point), and I can't think of a single language that doesn't support functional idioms that has an inline function type syntax. There's a reason the two go hand in hand.

Never mind java 8, why is it impractical to do something like (or implement the logic behind):

   let words: Seq[String] = mySeq.map(this~gen_list(size))
  .flat_map(decoder~())
  .splitAt(encoder~is_delimiter())
  .map(this~section2string())
  .join(" ")

in java?

It's not just that it's "more typing" and my hands get tired.You can actually type on and pass around functions through interfaces (sounds familiar). So therefore, theoretically, this is all stuff you can do in java, it's just "less terse". Practically, no one does this in java ever because it's extra typing, extra cognitive load (okay, this is expecting a WombatHungryFn, and I use intellij and it tells me it's an interface that has an apply function that returns a boolean by taking in a wombat and an int, okay and the next function wants ...), and working around a language is never fun. Having inline function types also has positives - tidier namespace of interfaces, easy visual view of what the function actually is, works well for one off functions.

Breakdown:

So you can say it's still possible. Technically, it is just more typing, but it's the same case for java. If someone was making the argument to me that java (pre java 8 of course)is fine with supporting functional idioms, I'd disagree (heavily), although technically it's still possible, just more typing.

That's why the problem doesn't seem as big as it is. from the client side, there is no problem (unless stuff returns functions), and when we write lambdas types are inferred - but that's just a weird special case since nowhere else in the language are types inferred.

Scala actually used to use Function0[RET], Function1[A, RET], Function2[A, B, RET], etc. Now they have the nice fn: (Arg 1, Arg2, ...) => Ret syntax (that actually gets replaced by FunctionX[...,...,...] under the hood).

We've got primitives, first class functions (yes, they're objects, but lambda vals are turned into primitives), and we can define them inline! We've got tuples, and easy tuple typing (no Tuple1[U64], Tuple2[U64, Bool], Tuple3[Bool, U64, String] nonsense). Pony is almost all the way there.

For now, everyone writing code with these functional idioms is just going to use a file containing "FnBox1, FnBoxP1, FnVal2...." and when/if pony gets macros/some way of hacking on the AST, some enterprising person is going to write a fn macro that expands:

let k: fnp(ref, U64, U64, Bool, iso)
into FnRef2P[U64, U64, Bool] iso

whereas we all might be happier with: let k: fntype ref(U64, U64 ? => Bool) iso (I hate to admit it, but "fntype" is starting to grow on me)

In my mind, there's a reason this has kept coming up in discussions - and I can't imagine that it won't come up again in the future. It's the difference between a java-level of function specification support and a scala/haskell/go/python/etc. level of function specification support.

andymcn commented 8 years ago

@ryanai3, rather surprisingly you have convinced me, which I was not expecting to happen. So yes, we will add inline function type syntax, which will actually be very simple sugar for an anonymous interface type with 1 function called "apply".

I've had a long discussion with @sylvanc about exactly what syntax to use and we've failed to come to a conclusion. The key problem is that the way we'd naturally like to do it is ambiguous. There are plenty of possibilities, but none feels exactly right. So I'm going to post the most sensible options here for comments.

For each option I'll give a full example with all the optional stuff and 2 really simple examples. They are equivalent to the types for:

fun ref apply[A: Bar](x: U64, y: Foo): Bool ?

fun apply(x: U64)

fun apply(): Bool

You may also need to specify the capability of the reference you have to the function. I'll add this as iso in the complex example and omit it (use the default) in the simple examples.

Option 1, starting with a keyword.

This is basically the format I've been using throughout this issue.

funtype ref[A: Bar](U64, Foo => Bool ?) iso

funtype (U64)

funtype (=> Bool)

This has the advantage that the use of a keyword makes it easy to spot when glancing at unfamiliar code.

The disadvantage is that the return type has to go within the parentheses and we need to use => for the return type rather than :. Both of these are different to Pony function syntax.

There's also the question of what keyword to use. This is a trade-off between brevity and clarity. I think funtype is the clearest and most obvious what it means. It is however rather long. fn is the shortest practical option, but it doesn't convey that it's a type and also I worry people will forget when to use fun and when to use fn.

From a compiler point of view it would be acceptable to use fun. However I think that would also be confusing when glancing at code. It would also be annoying for auto-indenting type tools and makes it harder for the parser to resync after errors.

Option 2, starting with a keyword and a having a terminator.

As option 1, but ending with end. This would allow us to move the return type outside the parens and use start Pony function syntax.

funtype ref[A: Bar](U64, Foo): Bool ? end iso

funtype (U64) end

funtype (=> Bool) end

The downside is this makes things quite long, especially in the simple cases.

Option 3, delimiting symbols.

There are various symbols we could use to mark the beginning and end without ambiguity for the compiler.

[ref[A: Bar](U64, Foo): Bool ?] iso

[(U64)]

[()=> Bool]

Again this has the advantage of using the existing return type syntax, plus it is the shortest possible.

One disadvantage is that it doesn't inherently tell you it's a type. Worse is that there aren't any standard symbols free, so we'd have to overload something which is always annoying as it makes it less clear what code is doing at a glance.

I worry that using [ and ] will confuse people with regard to type parameters, especially where you have a function type that takes type parameters (or a type parameter bound to a function type).

We could use other symbols, and they could be the same for the start and end, eg:

\ref[A: Bar](U64, Foo): Bool ?\ iso

\(U64)\

\()=> Bool\

One question which has occurred to me whilst writing this, is do we want to be able to similarly specify a behaviour type? This could be done trivially is we went for options 1 or 2, eg:

betype ref[A: Bar](U64, Foo) end

betype (U64) end

If we went for option 3 we'd have to add something extra, fun and be would be the obvious choices:

[fun ref[A: Bar](U64, Foo): Bool ?] iso
[be ref[A: Bar](U64, Foo)]

[fun (U64)]
[be (U64)]

That feels rather odd to me and is still a potential problem for auto-indenters and parser resyncing.

jemc commented 8 years ago

With that in mind, Option 1 would be my preference - the end in Option 2 is verbose and likely to be confusing seeing an end inside a type within another expression. It doesn't seem like there are any good delimiting symbols for Option 3 (in my opinion).

I think funtype (and betype) are likely to be easier to understand/remember/differentiate compared to thinking of fun vs fn.

ryanai3 commented 8 years ago

@andymcn @sylvanc Option 2 is a bit too verbose for me. As much as "funtype" has grown on me, I like option 3 more.

Why? It's 100% no confusion clear what it is. It looks exactly like a function or behavior signature, and even with the "fun" or "be", it's shorter than Options 1 or 2. So it wins from a clarity perspective, from a "remembering the syntax" perspective, from a terseness perspective, and I assume it's more elegant to fit into the parser (just parse it as a function signature)

You're right that it's a bit odd to use [] for the delimiters, given that they're used for type parameters. They can definitely get confusing in the case of something like [[A: Bar, C: Stringable](A, U64, C): String

What if we used a different delimiter ;) - {} because we only plan to use curly braces for array and hash literals, it's a lot less ambiguous. If a curly brace expression is in the place we would expect a type, it's a function or behavior type. If it's in the place we would expect an expression (after an equals, passed into a method, etc.), then it's a hash or array. It's different than [] and (), so there is little confusion when tuples or parametrization is involved in the function type, and it looks cool. plus, types are technically literals, right?

When you see:

{fun ref [A: Bar](U64, Foo): Bool ?} iso {be (U64)} {fun (): Bool}

you know exactly what's happening.

quick question which pertains to defaults: if we have the interfaces:

interface FnRef
  fun ref apply(x: U64): U64

interface FnBox
  fun ref apply(x: U64): U64

corresponding to: {fun ref(U64): U64)} ref (A) and {fun box(U64): U64} box (B) which is true?

  1. A subtypeof B or
  2. B subtypeof A.

I ask, because it seems from our discussion that 1 is the case. However, the pony compiler tells me that 2 is the case. Being able to call an ref-fun implies being able to call a box-fun, but being able to call a box-fun does not imply being able to call a ref-fun. Therefore, isn't 2 true?

And therefore, doesn't this affect defaults? i.e., we would want the default to not be "box" but rather trn, since {fun trn(U64): U64 trn}is a supertype of A and B?

andymcn commented 8 years ago

Using { } for both types and array literals is an interesting option. I'll look into it.

@ryanai3, re your subtyping question, neither of A and B are subtypes of each other. There are 2 capabilities involved, the receiver capability for apply and the reference to the function object that you have. These do not have to be the same as far as the type system is concerned and they subtype in opposite directions.

Note that I believe there is a typo in your example, both FnRef and FnBox have ref receiver capability on apply, making the 2 types identical. This may have affected what the compiler told you about subtyping relationships if that error was repeated in your experimentation code.

For clarity I'm assuming:

interface FnRef
  fun ref apply(x: U64): U64

interface FnBox
  fun box apply(x: U64): U64

And I have: A. {fun ref(U64): U64)} ref, ie FnRef ref B. {fun box(U64): U64)} box, ie FnBox box

Case 1. If I have a ref reference to some object then I have the right to mutate it's (public) internal state, so a box object will not do instead. Therefore if I ask for an A you can not give me a B instead.

Case 2. If I have a box reference to some object then I am not allowed to call a ref function on it. Therefore if I ask for a B you can not give me an A instead.

So A and B are unrelated types.

The supertype for such entities would actually be {fun trn(U64): U64} box, which would be useless as you couldn't call apply on it.

The only way to get a supertype for A and B is a union of them: (A | B). Currently you can't call apply on such a construct, but you will be able to when I've implemented issue #318, which hopefully will be in the next few weeks.

To have direct support for subtyping function types they would have to be a new construct in the type system, rather than just aliases for interfaces. It's unclear without further investigation whether this would be possible at all.

ryanai3 commented 8 years ago

@andymcn I see now. That makes a lot more sense. When I was looking at subtyping, I was ignoring the capability on the lambda itself. of course {fun box(A): B} ref is a subtype of {fun ref(A): B} ref, but not when you care about calling the function.

the sort of "use case" for function subtyping was in writing scala style foreach, map, filter, etc, where the function passed in can be a ref or a val or a box (anything but an iso really) since all you care about is being able to call it. in a par_map, par_filter, etc. you expect a val function which is an easy {fun val(A): B}. when you want a ref or val, you need: {fun ref(A): B} | {fun val(A): B} in hindsight, that isn't so bad, and if it is, probably time for a type alias.

just want to comment that there's some really interesting patterns that could result from this - calling .map(some_func) where the behavior of some_func can change based on the collection it's being called on.

other than that, which is an easily solved problem once calling on unions works, I don't see a use-case for subtyping function types.

Also, just wanted to say: Thanks for being so open to discussion around this! :)

andymcn commented 8 years ago

Nice names for lambda types in error reports and inline function syntax are now both in. There's a trivial example of both in examples/lambda/lambda.pony.

For lambdas, instead of appearing in error reports like:

$0$4 val

they're now like:

lambda(U32, U32): U32 ? end val

Obviously this format can be tweaked if people have suggestions to make it more readable.

For inline function syntax I've gone with surrounding { }, no keyword and return syntax using :, as for actual functions. Example:

{box (U32, U32): Bool ?} val

In the above the box is the receiver capability of the object's apply function and the val is the capability of the reference to the object. Both are optional.

The simplest possible inline function type, that takes no arguments or type arguments, cannot raise an error and has no return type, would be:

{()}

I believe that's everything concluded from this issue, so I'm going to close it. However, the discussion was quite long and did go off at a few tangents. If anyone thinks something here is still unresolved then please open a new, more specific, issue.