ozra / onyx-lang

The Onyx Programming Language

Function / Method / Lambda Definition #11

Open ozra opened 8 years ago

ozra commented 8 years ago

Functions / Methods

Note: See https://github.com/ozra/onyx-lang/issues/11#issuecomment-217818212 for latest RFC suggestion!

This is a staple construct of any language, and as such I believe there will be some opinions on it.

I like defining functions to be simple and straightforward, and therefore prefer as little formalia as possible while still making them clearly distinguishable, with a clear beacon (the spatial perspective). At the moment there are two different ways of defining them, so as not to lock in my personal preference alone and to keep it open for discussion.

Note, I use lisp-style separators in the examples; all the below could of course (as already pointed out) be written with snake or camel.

[edit: I forgot about generics - see separate issue about it instead]

  1. Function/Method name first (minimal formalia)
this-is-a-func(a-param) -> a-param + 1

this-is-another(a-param) ->
   a-param + 1

yet-one(a-typed-param Real, foo SomeType) ->
   foo.blargh = 47
   a-typed-param + 1.0
end

type MyType
   my-member I32 = 47

   a-method(a-param) -> abstract

   another-method(a-param) ->
      @my-member + a-param

   third-method() -> @my-member
end
  2. Prefixed with a keyword (the most common style in other current languages; several keywords are currently allowed and all are equivalent: fn, def, etc.) - support for this syntax has been dropped - it provides no benefit!
fn this-is-a-func(a-param) -> a-param + 1

fn this-is-another(a-param) ->
   a-param + 1

def and-a-routine(a-typed-param Real, foo SomeType) ->
   foo.blargh = 47
   a-typed-param + 1.0
end

type MyType
   my-member Int32 = 47

   def a-method(a-param) -> abstract

   def another-method(a-param) ->
      @my-member + a-param

   def third-method() ->
      @my-member
end

Thoughts concerning both

Thoughts about (1):

Thoughts about (2):

Note: As mentioned above: this syntax has now been removed and deemed non-beneficial.

The main reason the keyword style is so common is not because "it's better" - it's because it's easier to write a parser for it!

More Details, Common to Both

Using (1)-syntax in examples.

Sugar for "Functions" Not Returning Any Value

Since this is an imperative language, functions that return no (usable) value are not uncommon (most often as part of types, where they only modify member data of self). To avoid accidentally leaking internal state through implicit returns, without having to tediously type nil as the final expression every time, an exclamation mark can be suffixed to the 'function-arrow', implying "action"/"command"/"procedure"/"routine". This makes sure the function ends with nil and sets the return type to Nil. Inspired by LS.

my-mutating-only-routine(a Foo, flag) ->!
   a.some-flag = flag
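
For readers who want to poke at the idea outside Onyx, here is a rough Python analogue of what `->!` is described as doing. It is only a sketch: the `procedure` decorator and the `Foo` class are hypothetical names invented for this illustration, and because Python does not return the last expression implicitly, the leak is simulated with an explicit `return`.

```python
import functools


def procedure(func):
    """Hypothetical stand-in for `->!`: run the body, always return None."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        func(*args, **kwargs)  # whatever the body evaluates to is discarded
        return None
    return wrapper


class Foo:
    """Illustrative type with one piece of mutable state."""
    def __init__(self):
        self.some_flag = False


@procedure
def my_mutating_only_routine(a, flag):
    a.some_flag = flag
    return a.some_flag  # simulates an implicit last-expression return


foo = Foo()
result = my_mutating_only_routine(foo, True)
print(result, foo.some_flag)
```

The state is still mutated, but the caller never sees the last expression's value.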

Formal Parameters

more-formal-params(a Int32 = 47, b String = "Cool!"; c = Foo()) String|Int32 ->
   say "c is {c}"
   if b is "Dude" then a else b

After much pondering, I came to the conclusion that the best way to mark visibility is per function (as opposed to grouped), and as sleekly as possible: suffix the name with an asterisk for protected, two asterisks for private. No asterisk means public. Public is the default because of the "openness and patchability" philosophy of Onyx.

GOTCHA: This is the opposite of Modula and Nim, where an asterisk designates public visibility and defaults to private.

type Foo
   i-am-public(x) -> stuff
   i-am-protected*(x) -> stuff
   i-am-private**(x) -> stuff

The asterisks are only typed at definition, not in calls (they're not part of the name).
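
The suffix rule can be sketched in a few lines of Python. This is not how the Onyx compiler does it; `split_visibility` and `VISIBILITY_BY_STARS` are hypothetical names for this illustration, and the sketch only covers ordinary identifier names, not operator definitions.

```python
from typing import Tuple

# Assumed mapping, taken from the description above: 0, 1 or 2 trailing asterisks.
VISIBILITY_BY_STARS = {0: "public", 1: "protected", 2: "private"}


def split_visibility(def_name: str) -> Tuple[str, str]:
    """Split a name as written at definition into (callable name, visibility)."""
    name = def_name.rstrip("*")
    stars = len(def_name) - len(name)
    if not name:
        # Names made only of asterisks are outside the scope of this sketch.
        raise ValueError(f"no name left in {def_name!r}")
    if stars not in VISIBILITY_BY_STARS:
        raise ValueError(f"too many asterisks in {def_name!r}")
    return name, VISIBILITY_BY_STARS[stars]


for written in ("i-am-public", "i-am-protected*", "i-am-private**"):
    print(written, "->", split_visibility(written))
```

The returned name is what a call site would use, since the asterisks are not part of the name.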

It sort of looks like a footnote, sort of "there's a gotcha about this one", I like it.

There is one unresolved issue with this, which I can't imagine ever coming up in practice, but I have a solution for it, which for natural reasons is not at the top of the list atm. Guess what it is. :-)

Pragmas

Some pragmas usable with functions: 'inline, 'no-inline, 'returns-twice, 'no-return, 'naked, 'raises.

LLVM is very capable of inlining the right stuff for optimum speed (or size), so this should rarely have to be used.

There are also some pragmas for changing semantics: 'pure - this is not implemented yet though ;-) And I'm still thinking about better ways to express pure functions, in order to promote writing them.

Additional Notes

my-lambda = (x Int, y Int) -> x + y

x = my-lambda 3, 5
-- x = 8

my-fun-fun(f (Int, Int)->, a, b) ->
   f a, b -- call the lambda we got as arg `f` with args `a` and `b`

u = my-fun-fun my-lambda, 3, 4
-- u = 7

z = 47
my-closuring-lambda = (x Int) ->
   z += x
   z

y = my-closuring-lambda -5
-- y == 42
-- z == 42
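
For comparison, the closure example maps onto Python roughly like this. It is an analogue only: `make_closuring_lambda` and `current_z` are hypothetical helpers added so the captured variable can be inspected, and Python needs an explicit `nonlocal` where the Onyx example simply assigns to `z`.

```python
def make_closuring_lambda(start):
    """Build a closure over `z`, mirroring the my-closuring-lambda example."""
    z = start

    def closuring_lambda(x):
        nonlocal z  # mutate the captured variable, not a local copy
        z += x
        return z

    def current_z():
        return z  # lets the caller observe the captured state

    return closuring_lambda, current_z


my_closuring_lambda, current_z = make_closuring_lambda(47)
y = my_closuring_lambda(-5)
print(y, current_z())
```

Both the returned value and the captured variable reflect the update, matching the `y == 42` and `z == 42` comments in the example.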

Soft Lambdas Fragments

This is a special beast - see them in their own issue: #14

Aaaaand, as always: remind me of what I've forgotten or should clarify.

stugol commented 8 years ago

I think parentheses should be optional when declaring functions that take no arguments. Also, the optional end keyword should be disallowed unless using fn or def. Reason being, those keywords are redundant, and serve only as an explicit starter. end is also redundant, serving only as an explicit finisher. Therefore they should both be used, or neither used.

Suggest that fn mean "pure function", rather than an alias for def. The implicit variant could use =>, assuming that won't interfere with hash notation.

Suggest that -> is not used when fn and def are used. It's unnecessary and looks wrong.

->! syntax is a good idea. Explicit variation should specify Nil as the return type:

def myfunction() Nil
end

Mixing , and ; in a function signature makes no sense, and will only cause confusion. Should use , only.

What is the "unresolved issue" with the private/protected syntax?

What is "returns twice"?

Why is "raises" necessary? Surely the compiler knows that something can raise, because it directly or indirectly calls raise.

ozra commented 8 years ago

As for the parentheses in func-def I think that's a good formalia for consistency. Two parens aren't hard to type, for a func def, and it also helps making parsing more reliable (in guessing what actually went wrong when there's an error, etc.). The parsing is already a bit of a beast when it comes to func vs calls vs lambdas vs paren expressions vs soft lambdas... I make note of it though!

As for the rest of the suggestions in between, I agree on them at large, let's wait and see if there's some more input first.

The unresolved issue is for the *(x) -> operator. I don't think a private one of those is very common. But it's easily solved via some separator or so: "`(x) ->" or something (gh-markdown can't handle the notation, so I made it a quoted string here).

The 'returns-twice and 'raises pragmas are for interaction with C libs, where this can't be inferred since the C funcs' actions are opaque to Onyx. You can google them for more details (LLVM).

stugol commented 8 years ago

I've googled for "returns twice" and can't find anything useful :(

ozra commented 8 years ago

It's only used on "veery special occasions", setjmp, vfork, etc. Chances are you'll never have to think about it again.

ozra commented 8 years ago

Change to Func-def Syntax Positioning of Return Type

[slight edits made 2016-09-20]

Since Onyx is in its autumn cleaning phase, I want to narrow down the variations in the language and start getting it solidified. I want to get breaking changes done as early as possible.

This is another of those I've been thinking a while about, but not issued.

Proposition at a glance with a few examples

fn1(x, y) -> SomeType
   a = do-stuff
   b = SomeType a
   b

fn2(x Int, y Int) -> Int
    a = x + y
    a + 2

fn3(x, y) -> x + y

fn4(x Int, y Int) -> x + y

fn5(x Int, y Int) -> Int   -- ~~Gotcha: one-liner assumes value after arrow: returns `Int`!~~
                           -- Change! Should simply error! "Do you want to return Int, or set return type too?"
fn6(x Int, y Int) -> Int: x + y  -- One-liner with return type Int

Pros

The arrow -> in function defs, along with the args, is the telltale beacon; pretty much ) -> is the strongly recognisable part. The only con is that one-line functions require an additional nest-starter after the return type, iff one is specified.

Currently:

fun(x) -> x
ret-typed-fun(x) Int -> x
fun-returning-a-type() -> Int  -- extremely unlikely in reality, without further code
ret-typed-fun2(x) Int ->
   x
fun-type = '(Int, Bool) -> Int

Proposed:

fun(x) -> x
ret-typed-fun(x) -> Int: x
fun-returning-a-type() ->: Int  -- extremely unlikely in reality -  ugliness acceptable
ret-typed-fun2(x) -> Int
   x
fun-type = '(Int, Bool) -> Int

And, of course you could write one-liners using any of the nest-starters, as usual, or even an expression delimiter (;):

fn1(x Int, y Int) -> Int: x + y

fn2(x Int, y Int) -> Int => x + y

fn3(x Int, y Int) -> Int do x + y

fn4(x Int, y Int) -> Int then x + y

Arguments and criticism against this change (or for it) are highly welcome, please elaborate.

stugol commented 8 years ago

I'm okay with either style, to be honest. I'm not sure why you're calling new on SomeType though. Don't we just funcall it? = SomeType()

ozra commented 8 years ago

You're absolutely right, head got stuck in Crystal mode, haha. .new works in onyx too, but as you say, the preferred style is just "calling the type". I'll edit to make it idiomatic.

stugol commented 8 years ago

What would prevent a becoming equal to the type?

a = SomeType

ozra commented 8 years ago

The name of a type or lambda is primarily considered a value, unlike functions and methods, which are primarily calls. This means that either arguments or empty call-parens are required to instantiate.

a = SomeType  -- the value of a is the type
a = SomeType()  -- the value of a is an instance of the type
a = SomeType(x)  -- the value of a is an instance of the type
a = SomeType x  -- the value of a is an instance of the type
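
Python happens to draw the same line between a type as a value and an instance, which may help as a point of reference. This is only an analogy: the `SomeType` class below is a made-up stand-in, and Python has no paren-less call form like `SomeType x`.

```python
class SomeType:
    """Made-up stand-in type for the comparison."""
    def __init__(self, x=None):
        self.x = x


a = SomeType      # the value of a is the type itself
b = SomeType()    # the value of b is an instance
c = SomeType(3)   # the value of c is an instance built from an argument

print(a is SomeType, isinstance(b, SomeType), c.x)
```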

Did that make it clear (cause I'm not fully in the clear about the question ;-) )?

stugol commented 8 years ago

I think that might lead to obscure bugs. Given how rarely a type will be assigned to a variable, why not require special syntax?

a = SomeType   -- instantiation
a := SomeType  -- type assignment
type a = SomeType   -- alternative syntax

ozra commented 8 years ago

Thanks for pointing it out, it's also not consistent with funcs/methods. But there are also lambdas and functors, which have the same behaviour as Type. It's a value primarily.

[ed: the below text stems from me thinking and typing at the same time, so it might be a bit un-orderly]

There are voluntary type prefixes (only required in certain situations, not even practical yet), could use that I guess:

a = 'SomeType

But... that fails one of its main usage-ideas:

a-typed-var 'SomeType

Silly me - so of course another symbol is required. The prefix-method should definitely be the style in any event. Reminds me, I must look into how to make a lambda-value from a function:

func(x) -> x
some-lambda = ->func(Int) -> Int  -- or something like that
some-type = ^SomeType  -- or something like that

But here then some-lambda (as mentioned) is considered a value, unless called with parens or args (same as type atm).

Allowing plain SomeType as instantiation still conflicts with constants. For that reason, considering AnyPascalCased as a compile-time-constant value (regardless of whether it's a type or another value) would be the simplest way. But that's preferably not what we want in Onyx. Differentiating between SomeConst and SomeType can probably be done almost as efficiently by going deeper into the compile process (it can't be solved at a syntactic stage) - I'll keep it in mind and think about it a bit, because it definitely deserves "getting right".

So: Should lambdas - and functors for that matter - (even though they're values) be considered calls at all times, and always require ->-prefix to be seen as values? It's a possible way of going about it. Just a bit more involved.

Some examples for comparison, first: current syntax (except fun->lambd which I still haven't gotten around to verifying):

some-func() -> 7
type SomeType  -- an empty type, pretty useless, but enough for example
a-lambda = ->some-func() -> Int  -- convert a func to a lambda - _has to be typed_

another-lambda = () Int -> 5  -- lambda from lambda-literal

type MyFunctor: call() -> 3
a-functor = MyFunctor()

func-using-lambda(lamb) -> lamb() + 47
func-instantiating-type(typ) -> typ.new -- this would require even further complexity to resolve to new'ing from just 'typ'

func-using-lambda a-lambda
func-using-lambda another-lambda
func-using-lambda a-functor
func-instantiating-type SomeType

some-func  -- calls func
a-lambda() -- calls
another-lambda() -- calls
a-functor() -- calls
SomeType() -- instantiates

->some-func() -> Int  -- func as lambda-value
a-lambda -- value
another-lambda -- value
a-functor -- value
SomeType -- value

And with the proposed changes

some-func() -> 7
type SomeType  -- an empty type, pretty useless, but enough for example
a-lambda = ->some-func() -> Int  -- convert a func to a lambda - _has to be typed_

another-lambda = () Int -> 5  -- lambda from lambda-literal

type MyFunctor: call() -> 3
a-functor = MyFunctor  -- this would now result in instantiation

func-using-lambda(lamb) -> lamb + 47  -- parens not required on lamb (will require deep compilation juggling)
func-instantiating-type(typ) -> typ -- this would also require even further complexity to resolve to new'ing from just 'typ'

-- func-using-lambda a-lambda  -- would not do what was intended - a-lambda is now called at call-site
-- func-using-lambda another-lambda  -- would not do what was intended - another-lambda is now called at call-site
-- func-using-lambda a-functor   -- would not do what was intended - a-functor is now called at call-site
-- func-instantiating-type SomeType  -- would not do what was intended - SomeType is now instantiated at call-site
func-using-lambda ->a-lambda  -- would work, it's already typed, no need to do again
func-using-lambda ->another-lambda  -- would work, it's already typed, no need to do again
func-using-lambda ->a-functor  -- would work
func-instantiating-type ^SomeType  -- would work

some-func  -- calls func
a-lambda -- also calls
another-lambda -- also calls
a-functor -- also calls
SomeType -- now instantiates

->some-func() -> Int  -- func as lambda-value
->a-lambda -- value
->another-lambda -- value
->a-functor -- value
^SomeType -- value

So lambdas, functors and types would then follow the same pattern as functions/methods (of course with much simpler 'value'ing notation, since they don't need to be typed up)

It seems, in a way, the possibility of error is just moved to other areas? Is the change worthwhile?

This definitely needs some thinking about. Of course types are the primary interest here when weighing how it should be implemented - the others aren't very common: I just think lambdas/functors should follow along whatever decision is taken for consistency.

And finally: Foo.Bar.Qwo.some-func -> currently simple and accessible syntax to scope through name-spaces. With the change it will no longer be very clear whether it means Foo.new.Bar.new.Qwo.new.some-func or the previously described scenario.

ozra commented 8 years ago

After a quick smoke-break, my brain tells me it's better to improve error-messages if SomeType is used in a context where it reasonably should have been SomeType(), "error so and such bla bla - did you mean SomeType()?"

ozra commented 8 years ago

Further on the issue at hand: I've also been thinking about whether operators should be enclosed with delimiters in function-defs for clarity:

Now

ext I32
   *(x SomeType) -> this * x.value

Then

ext I32
   `*`(x SomeType) -> this * x.value

Then also functor call funcs could be defined as () instead of call (well it could anyway, but it looks a bit crazy!):

type MyFunctor
   `()`(x, y) -> say "{x} and {y}"

It also solves the...

`*`*(x) ->

...dilemma for protected/private (which in honesty isn't that much of a real-world problem, but...)

Another option is to allow this as alternative style, and it can be used explicitly only when required (notably only * and '**' with protected or private asterisks).

Then the stylizer can render it as wished for repo vs private editing.

ozra commented 7 years ago

I want to either implement or dismiss the syntax change suggested in https://github.com/ozra/onyx-lang/issues/11#issuecomment-217818212, so that this basic construct of the language is finally settled and nailed down.

I think it makes sense and looks much more natural for the different cases of defining a function, with and without explicit return type, single or multi-line.

On the other hand, it does vary slightly with regard to the meaning of a Constish following the arrow, depending on if an indented function body follows or not. That's the only detail that nags me.

The variation problem has been solved. One-liner with return type must separate with nest-start-token or expression-separator (;) (semi colon won't work of course: can be ret-type; exprs; or exprs; exprs;...)

Sod-Almighty commented 7 years ago

Agreed. New syntax looks better.

One-line statements like fn5(x Int, y Int) -> Int should error out as suggested above, to prevent mistakes; but only when apparently returning a type. fn5 (x Int) -> x*x should not throw an error.

(The non-erroring code for fn5, if returning Int was actually intentional, would presumably be fn5(x Int, y Int) -> Class: Int. Actually, it's Type now, right? So: fn5(x Int, y Int) -> Type: Int)

ozra commented 7 years ago

> fn5 (x Int) -> x*x should not throw an error.

Correct! That's business as usual.

Only when the arrow is followed immediately by what fits the rules of a type and nothing more, will it error. If a body follows after the possible type - it is considered a type. If it doesn't seem like a type after the arrow - it's considered body. It sounds fuzzy, but testing the alternatives, it seems to me there is no risk of accidental mishaps, since the only gray-area immediately errors. And the gray-area should be uncommon enough to not cause an annoyance in everyday coding. Basically only if you have a bunch of one-line functions that return a constant only. But then again: MyConst = 1; foo() ->; MyConst. Not too hard ;-)
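
The decision rule can be written down as a small Python sketch. This is an illustration under stated assumptions, not the actual Onyx parser: `classify_after_arrow` is a hypothetical helper, "fits the rules of a type" is approximated as PascalCase names optionally joined with `|`, and the nest-starters are the four used in this thread (`:`, `=>`, `do`, `then`).

```python
import re

# Assumption: a "type" is one or more PascalCase names joined by `|`.
TYPE = r"[A-Z][A-Za-z0-9]*(?:\|[A-Z][A-Za-z0-9]*)*"
# Assumption: the nest-starters are the ones shown in the thread's examples.
STARTER = r"(?::|=>|\bdo\b|\bthen\b)"

TYPE_ONLY_RE = re.compile(TYPE)
TYPED_ONE_LINER_RE = re.compile(rf"(?P<type>{TYPE})\s*{STARTER}\s*(?P<body>.+)")
LEADING_STARTER_RE = re.compile(rf"{STARTER}\s*")


def classify_after_arrow(rest, has_indented_body):
    """Classify the text after `->` on a definition's header line."""
    rest = rest.strip()
    if TYPE_ONLY_RE.fullmatch(rest):
        if has_indented_body:
            return ("return-type", rest)
        # The gray area: a lone type and nothing more is rejected outright.
        raise SyntaxError(f"return {rest}, or set the return type to {rest}?")
    typed = TYPED_ONE_LINER_RE.fullmatch(rest)
    if typed:
        return ("typed-one-liner", typed.group("type"), typed.group("body"))
    starter = LEADING_STARTER_RE.match(rest)
    if starter:
        # `->: Int` style: a starter right after the arrow, so the rest is body.
        return ("body", rest[starter.end():])
    return ("body", rest)


print(classify_after_arrow("Int", True))
print(classify_after_arrow("Int: x + y", False))
print(classify_after_arrow("x*x", False))
print(classify_after_arrow(": Int", False))
```

Only the lone-type one-liner raises; every other shape resolves to either a return type or a body.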

The "type of a type" is usually called "kind" in type theoretical lingo, so therefore they're named Kind in Onyx. And you're right: all these three would work (fn2 and fn3 of course inferring the type Kind):

fn1(x Int, y Int) -> Kind: Int
fn2(x Int, y Int) ->: Int
fn3(x Int, y Int) -> return Int