Ambiguous functions declaration, overloading and shadowing

yloiseau commented 8 years ago

Synthesis of comments on #312 #313 Linked to #320 Probably linked to #275

We have an issue regarding the resolution of overloaded functions, more specifically when using variable arity ones (varargs)

A little history

When using the literal notation for function references (^module::function) or the Predefined::fun function without specifying the function arity (fun("function", module.class) or fun("function", module.class, -1)), for which the literal notation is just syntactic sugar, if the referenced function was overloaded, that is the function has multiple definitions with different arities, the return function reference used to be undefined. The first found was return, but with no guaranty on with this first one would be.

For instance, with code like

function foo = |x| -> ...

function foo = |x, y| -> ...

let f = fun("foo", currentModule.class)

f could be either of the two functions.

A correction was applied to raise a AmbiguousFunctionReferenceException in this case, to avoid guessing. This force the developer to be explicit on the required arity by using fun("foo", currentModule.class, 1) or fun("foo", currentModule.class, 2), instead of silently using one or the other.

The problem is not present when calling the function directly as in foo(1) or foo(1, 2), since in this case, the number of arguments is known. This resolution takes place in runtime.FunctionCallSupport::findStaticMethodOrField.

As a side note, their is some logic duplication between runtime.FunctionCallSupport::findStaticMethodOrField and Predefined::fun (and probably also in the method call resolution).

Current problem

However, the direct call resolution use the number of arguments used on the call site, and try to match them with the number of parameters of the candidate function.

This raise an issue when the function is overloaded with a variable arity that shadow the fixed one. For instance, given the functions:

function foo = |x| -> ...

function foo = |x, y| -> ...

function foo = |xs...| -> ...

when using fun, the third one shadows the first one, and thus fun("foo", mod.class, 1) throws an exception;
when doing a direct call, the third one shadows the two others, depending on the number of arguments given: foo(1) can call the first or the third, and fun(1, 2) can call the second or the third.
Some tests

given

module Test

function foo = -> "0"
function foo = |x| -> "1"
function foo = |x, y| -> "2"
function foo = |x...| -> "1v"
function foo = |x, y...| -> "2v"

function test = |msg, fn| {
    try {                  
    println(msg + fn())
  } catch (e) {
    println(msg + "failed with " + e: getClass(): getName())
  }

}

function main = |args| {

  println("# run")

  # can be 0 or 1v
  test("direct 0: ", -> foo())

  # can be 1 or 1v or 2v
  test("direct 1: ", -> foo(1))

  # can be 2 or 1v or 2v
  test("direct 2: ", -> foo(1, 2))

  # can be 1v or 2v
  test("direct 3: ", -> foo(1, 2, 3))

  # ok, always 0
  test("ref 0: ", -> fun("foo", Test.class, 0)())

  # can be 1 or 1v
  test("ref -1: ", -> fun("foo", Test.class, -1)(1))

  # can be 1 or 1v
  test("ref 1: ", -> fun("foo", Test.class, 1)(1))

  # can be 2 or 2v
  test("ref 2: ", -> fun("foo", Test.class, 2)(1, 2))
}

#!/bin/bash
for i in `seq $1` ; do
    golo golo --files test.golo
done | sort | uniq -c

This gave me:

$ ./test.sh 1000
    379 direct 0: 0
    621 direct 0: 1v
    741 direct 1: 1
    151 direct 1: 1v
    108 direct 1: 2v
     93 direct 2: 1v
    799 direct 2: 2
    108 direct 2: 2v
    784 direct 3: 1v
    216 direct 3: 2v
   1000 ref 0: 0
   1000 ref -1: failed with AmbiguousFunctionReferenceException
   1000 ref 1: failed with AmbiguousFunctionReferenceException
   1000 ref 2: failed with AmbiguousFunctionReferenceException
   1000 # run

This is not just a corner case!

Solutions

Note that the two issues, though related, are not exactly the same, and thus a different solution can be adopted for direct call and function reference.

Explicit specification

Add a parameter to fun to specify if we want variable or fixed arity. This would work when getting a function reference, but how do we resolve direct calls?

Multiple references proxy

Also only in the case of function references (e.g. fun with no arity), we could return a proxy containing all the ambiguous references instead of throwing a runtime exception. The resolution can then be done on the call site, in a manner similar to direct calls.

This does not solve the issue of ambiguous direct calls.

Implicit choice

Define a priority between functions, for instance when multiple choices among fixed and variable arity, always use the fixed one.

Optionally also define a priority when referencing with fun without arity instead of raising a runtime exception.

I’m not found of this solution. Their will surely be cases where the wrong function is used, so we need a way to be more explicit.

Runtime exception

As for the fun function, throw an AmbiguousFunctionCallException in FunctionCallSupport if the function is not fully defined. This does not really solve the problem, but at least it does not pass unnoticed.

Compile time exception

Refuse to guess, and throw a compilation exception if the developer has defined ambiguous functions, where a variable arity one can shadow a fixed one, forcing the developer to rethink its API.

yloiseau commented 8 years ago

ping @Artpej are you dealing with this ? If so, I had started a branch (fix/ambiguous-function ) that you can use if you like.

artpej commented 8 years ago

Thanks @yloiseau for this great summary :smiley: . I think this one is really a big one and there's many cases we have to take care about. Every solutions can be great, that's depend on the case.

Compile time exception

two functions with the same name and the same arity in the same module should throw a compilation exception
two functions with the same name and the same arity in the same augmentation should throw a compilation exception
two augmentations (for the same class) / structs / unions inside the same module should throw a compilation exception

Implicit choice

function could be resolved in this order
- current module (according name and arity)
- imported (according name, types for java module, arity with imports list order)
- current module (according name and arity for varargs)
- imported (according name and is varargs with imports list order)
- docorated (fixed arity that match)
- docorated (variable arity)
- predifined functions
methods could be resolved in this order (mainly the same than functions inside an augmentation) plus:
- named augmentations according the with order (named augmentations resolution order : current module, imports order)
- current module augment
- imported augments according the import list order
- predefined methods (struct, unions, arrays, ...)
- java methods (for java objects)

Multiple references proxy

usefull for function references (with unknown arity), order can be choosed at call time

Explicit specification

the \ notation can be usefull to fix the arity of a function reference

Compile time exception

I think we should avoid this case if it is possible

I should missed many cases and it is just an idea of the concept. Anyways we could maybe introduce a kind of simple rules engine to determine which funtion / method call at runtime :

callsite informations -> candidates (methods + metadatas) -> rules engine -> decision

WDYT?

jponge commented 8 years ago

@Artpej I fully agree with you.

yloiseau commented 8 years ago

@Artpej I don't get your 3rd point about augmentations raising compilation exception. Do you mean that

augment Foo {
   function bar = |this| -> ...
}
augment Foo {
   function plop = |this| -> ...
}

should raise an exception? If so, I disagree. If your point is to fail if the 2nd augment is a bar function, then I'm ok.

I started to refactor augmentation resolution, but with a different order. Mine is: target > scope > kind > order

where:

target: the augmented class. specific > generic, that is if Foo and Object are augmented with a function with the same signature, the one on Foo must always be choosen, wherever it's defined.
scope: the more local definition must be choosen, local > imported > call stack (call stack to be added, in progess)
kind: how the method was added: normal > named this allow to add a named augmentation (mixin-like) and redefine directly some of the methods. For instance:

augmentation Mixin = {
    function foo = |this| -> "default foo"
    function sayFoo = |this| { println("Hello " + this: foo() }
}

augment Foo with Mixin
augment Foo {
    function foo = |this| -> "redefined foo"
}

Foo(): sayFoo() # must print "Hello redefined foo"

order: every other thing being equal, the first one is used.

Currently, the “real” method are always used before augmentations, that is we can't override a native java method with an augmentation. According to @jponge it could lead to issues. We have to think hard not to create unpredictable behavior.

I think a debug option (command line option or environment variable) could be defined to trace in the runtime which method/augmentation/function was used (resolved), to help understant the behavior of the code if such a case arise.

artpej commented 8 years ago

augment Foo {
   function bar = |this| -> ...
}

# long code listing ...

augment Foo {
   function plop = |this| -> ...
}

That's right, i had in mind to throw an exception in this case because the previous seem's to me confusing: it can be difficult to find where the plop method is defined at posteriori. Not a technical thing, just a constraint in order to improve readability. This is just my current opinion, I had not deeply think about it :wink:.

Tow others things:

It seem's to me there nothing that garenty the method order inside a module (eg class)
should we care about java inheritance:

augment java.util.List {
   function plop = |this| -> ...
}

augment java.util.ArrayList {
   function plop = |this| -> ...
}

jponge commented 8 years ago

@yloiseau yes, augmentations must not override existing class methods. Also agreed on the proposed resolution order.

@Artpej in your last example on List and ArrayList:

let a = java.util.ArrayList()
let b = java.util.LinkedList()

then:

a: plop() should resolve the ArrayList augmentation, while b: plop() should resolve the one on List.

jponge commented 8 years ago

Regarding compilation exceptions then we should do what's most reasonable for a dynamically-typed language.

Given:

augment java.lang.String {
  function toString = |this| -> "Ah Ah Ah"
}

then obviously we are not raising an exception in the current implementation, and at runtime there is no way toString from the augmentation gets selected. It seems fine to me, although if we loaded class definitions in the compiler to do further checkings we would be wise to raise a warning or even an error.

I suppose what @Artpej had in mind is cases like:

augmentation A = {
  function plop = |this| -> 1
}

augmentation B = {
  function plop = |this| -> 2
}

augment java.lang.String with A, B

where there is ambiguity in resolving which foo from A or B shall be called. One option would be to raise a compilation error, but again you need to load definitions.

Perhaps what we really want is just precise runtime semantics where we say that the above code would compile, but calling "yo": foo() would resolve to the foo from A because of the augmentation application ordering.

yloiseau commented 8 years ago

@jponge That is the case if I recall correctly

yloiseau commented 8 years ago

quoting section 9.4. Named augmentations

When applying several named augmentations, they are used in the application order. For instance, if AugmentA and AugmentB define both the method meth, and we augment java.lang.String with AugmentA, AugmentB, then calling "": meth() will call AugmentA::meth.

jponge commented 8 years ago

@yloiseau indeed :+1:

yloiseau commented 8 years ago

@Artpej Not sure about the multiple augmentation thing. It depends how you want to organize your code. For instance:

# define the foo behavior on several objects.
augment Foo {
    function foo = ...
    function sayFoo = ...
}
augment Bar {
   function foo = ...
   function sayFoo = ...
}
...

# define the bar behavior on several objects.
augment Foo {
   function bar = ...
   function sayBar = ...
}
augment Bar = {
    function bar = ...
    function sayBar = ...
}

yloiseau commented 8 years ago

One use case I had in mind with the resolution order for augmentations was:

module Augmentation

augmentation Fooable = {
    function foo = |this| -> "foo " + this: bar()
    function isFooable = |this| -> true
}
augment java.lang.Object {
    function isFooable = |this| -> false
}

module Data

struct Foo = { bar }
augment Foo with Augmentation.Fooable

module Lib

function sayFoo = |f| -> case {
    when f: isFooable() { println("Foo: " + f: foo() }
    otherwise { println("not fooable") }
}

module Main

import Data
import Lib
import Augmentation

augment java.lang.String {
    function bar = |this| -> this
}
augment java.lang.String with Fooable

augment java.util.List with Fooable
augment java.util.List {
    function foo = |this| -> "ListFoo"
}

function main = |args| {
    sayFoo(Object())  # not fooable
    sayFoo(Foo(42))  # Foo: foo 42
    sayFoo(list[42])    # Foo: ListFoo
    sayFoo("bar")       # Foo: foo bar
}

yloiseau commented 8 years ago

This is not working by far

jponge commented 8 years ago

In this case it's perhaps better to wrap around union types than augmentations.

yloiseau commented 8 years ago

@jponge indeed, but this allow to make any existing java object Fooable... Kind of overriding oftype for augmentations.

yloiseau commented 8 years ago

or emulation of higher kinded types :smile:

yloiseau commented 8 years ago

I think we'll need to fix this by small steps, since like @Artpej said it's a big one, linked to several other issues

jponge commented 8 years ago

or emulation of higher kinded types

I could see it coming :-p

jponge commented 8 years ago

Fixed?

yloiseau commented 8 years ago

Some cases are fixed by previous PRs, but I'm not sure we cover all corner cases.

yloiseau commented 8 years ago

So where are we? To summarize (feel free to edit this comment to add cases and details):

Declaring golo functions and augmentations:
- [x] two functions with the same name and arity in the same module throw a compilation exception #330
- [x] two functions with the same name and arity in the same augmentation throw a compilation exception #330
- [ ] warning when an augmentation function is shadowed by a regular method, an thus is never used?
Calling functions and methods: mimic the java resolution order, i.e. more specific first, fixed arity before varargs:
- [x] well defined and deterministic resolution order for augmentations: #334
- [x] fixed arity before varargs in regular methods resolution: #334, #333
- [x] overloaded regular instance methods: #339
- [ ] well defined and deterministic resolution order for regular methods: #455
- [ ] well defined and deterministic resolution order for functions (waiting FunctionCallSupport refactoring started by @yloiseau)
  - [ ] overloaded regular static methods (functions): #341
  - [ ] fixed arity before varargs in regular static methods (functions)
  - [ ] import priorities for functions: fully qualified > simple name, local > imported, the first imported one wins (see also #340)
Referencing functions (fun and ^):
- [x] add a parameter to fun to specify if we want a fixed arity or a varargs function. Ambiguous semantic here: does the arity parameter of fun specify the real arity of the method/function (i.e. number or parameters of the function) or does it represent the intended number of arguments for the method/function (i.e. the number of arguments we will call the function with). This is the same for fixed arity, but not for varargs (fixed in #366: the arity parameter is the real function arity)
- [x] extended literal notation for references #312
- [ ] use a “multiple reference proxy” that will be resolved when called if the reference specification is ambiguous, instead of throwing a AmbiguousFunctionReferenceException (@Artpej ?)
Debugging:
- [ ] add a debugging parameter (command line option, property configuration, environment variable) that allow to trace which function/method/augmentation is used

jponge commented 8 years ago

Yep :+1:

yloiseau commented 8 years ago

While we are at it, another idea than can fall in this part... What about autocurry? :smile: For instance, given

function foo = |a| -> a + 1
function foo = |a, b, c| -> a + b + c

foo(1) will return 2, as currently, but foo(1, 2), instead of throwing WrongMethodTypeException, would return a partialized version of foo\3, such that foo(1, 2)(3) returns 6.

This automatic approach would change the behavior of all functions (including Java native static methods), and could also be added in method support (being regular ones or augmentations)

On an other hand, this is quite easy to implement in a curry HOF/decorator, and thus make it an explicit behavior.

What do you think ?

jponge commented 8 years ago

Autocurry is an intriguing feature :smile:

On the minus side returning a partially-applied function reference if you make an error and forget some arguments is a likely to cause hard to track errors, since the syntax is that of a regular call.

On the plus side: it would look fine with a free variable / placeholder parameter, like _ in Scala, which could lead to code like:

let f = |a, b, c| -> a + b + c

let g = f(_, 0, _)
println(g(1, 3))

yloiseau commented 8 years ago

Hum… interesting. So this approach could even take place at compile time, (e.g. in SugarExpansionVisitor) by replacing

let g = f(?, 0, ?)

by

let g = ^f: bindAt(1, 0)

or something similar (yes, I found ? more appealing than _, partly since I often use _ to represent a ignored parameter for instance).

This would moreover allow to easily partialize on any argument, while with autocurry you always partialize parameters in order. On the other hand, autocurry mixed with named arguments would allow to simply do

let g = f(b=0)

which reminds me the @danielpetisme idea on fluent partializing using named parameters (see https://github.com/eclipse/golo-lang/pull/250#issuecomment-77352718)

Maybe we should keep discussions on this topic for another issue :smile:

franckverrot commented 8 years ago

+1 for autocurrying!

— Sent from Mailbox

On Wed, Jan 6, 2016 at 2:01 PM, Yannick Loiseau notifications@github.com wrote:

Hum… interesting. So this approach could even take place at compile time, (e.g. in SugarExpansionVisitor) by replacing
let g = f(?, 0, ?)
by
let g = ^f: bindAt(1, 0)
or something similar (yes, I found ? more appealing than _, partly since I often use _ to represent a ignored parameter for instance). This would moreover allow to easily partialize on any argument, while with autocurry you always partialize parameters in order. On the other hand, autocurry mixed with named arguments would allow to simply do
let g = f(b=0)
which reminds me the @danielpetisme idea on fluent partializing using named parameters (see https://github.com/eclipse/golo-lang/pull/250#issuecomment-77352718)

Maybe we should keep discussions on this topic for another issue :smile:

Reply to this email directly or view it on GitHub: https://github.com/eclipse/golo-lang/issues/326#issuecomment-169476329

jponge commented 8 years ago

It would make sense when done with, say, ? :+1:

eclipse-archived / golo-lang