cqcallaw / newt

The newt programming language
GNU General Public License v3.0
12 stars 2 forks source link

Recursive function invocation #3

Closed cqcallaw closed 7 years ago

cqcallaw commented 8 years ago

It is not currently possible to invoke a function from within the function's body

cqcallaw commented 8 years ago

From @laokaplow: must test that a recursively defined function works as expected when the function name shadows a variable in an outer scope.

cqcallaw commented 8 years ago

May also want to test/support mutually recursive functions and think about validation concerns. For example

a := () -> int {
   var:= b()
   return var
}

b := () -> int {
   return a()
}

is fine but

a := () -> int {
   var:= b()
   return var
}

b := () -> double {
   return a()
}

is not.

cqcallaw commented 8 years ago

@laokaplow observes that the previous example of mutual recursion does not preserve the property of declaration before use (the function b is used before it is defined). A formulation that preserves this property follows.

a:() -> int
b:() -> int

a = () -> int {
   var:= b()
   return var
}

b = () -> int {
   return a()
}

This appears to be semantically equivalent to forward declaration of the functions in question.

laokaplow commented 8 years ago

One way to absolve your users of having to write what are essentially forward-declarations is to do it for them.

Javascript does something like this. They call it "hoisting". Declarations there are considered to occur at the top of whatever block they are in, while the initialization (if any) remains in the original position as an assignment. In Javascript, variables are initialized with the value "undefined".

In newt, if you were to pursue such a strategy, you might initialize any variables hoisted in this way to their type's default value.

There are pros and cons to this approach.

Two immediate cons come to mind:

1) It would be impossible to directly refer to an outer variable if it is shadowed anywhere in the current scope. (workaround, assign it a different name in an intermediate scope, introducing an intermediate scope if necessary, then referring to that different name).

2) It changes the semantics of "must be defined before use". Previously, if a programmer declared a variable and provided an initial value, they knew that any use of that variable refer to that initial value until the value of the variable was changed. With hoisting it would be possible for a programmer to accidentally refer to the variable before it was specifically initialized, resulting in the default value.

If you implement declaration hoisting but either [a) hoist initialization expressions alongside declarations, or b) validate that all references to variables occur only after the provided initialization] then you are back to square one. With 'a, users would have to declare and define their mutually recursive functions separately anyway. With 'b, it would be very difficult to do the verification, but I can see it being tractable in some cases.

Also, it wouldn't work very well with constants.

laokaplow commented 8 years ago

Another approach is similar to one above, but would introduce a new special type of declaration and only do hoisting on those.

This notion of having a second way of declaring values (especially functions) is similar to some of the ideas we have talked about for supporting overloaded functions.

cqcallaw commented 8 years ago

Diving into the implementation of this, it has become apparent that using the function name as the subject of invocation does not cover all cases. Anonymous, immediately invoked functions are one example. Functions bounded to a type-inferred variable are another example that is less obvious but more common; in this case the function expression is preprocessed before the type inference and variable declaration occurs, so using the name doesn't have much semantic meaning.

cqcallaw commented 8 years ago

One solution for the previously mentioned edge cases is a symbol or keyword indicating the invocation is in fact an invocation of the surrounding function context.

cqcallaw commented 8 years ago

An issue with the "invoke surrounding context" symbol: if contexts are nested, which surrounding context is referenced by the symbol? What if (for example) we wish to invoke the grandparent function, instead of the parent function?

cqcallaw commented 8 years ago

The implementation of the symbol/keyword could walk up the context chain until the root is reached, or a function with a matching signature is found. However, Lao and I are generally of the opinion that this is gross and the symbol/keyword should reference the most-specific enclosing function context. If the argument list does not match the function signature, it's a semantic error.

cqcallaw commented 7 years ago

Commit 1242b8b6d369be0ccadb2c9d103aff909f2e0394 adds the basic functionality here. Recursively invoked anonymous functions are not supported, but functions bound to type-inferred variables are.

cqcallaw commented 7 years ago

Forward declaration seems like an acceptable solution to the problem of mutual recursion, and recursive anonymous functions do not seem like a compelling enough feature to implement at this time.

Infinite recursion is not yet handled well by the runtime.

cqcallaw commented 7 years ago

Infinite recursion fixed in 15624f372d3da52165a33f9cf1782623684ad921 and e76d8562ce865029065973807a6f5e95da93ab1e