Open ozra opened 8 years ago
I think parentheses should be optional when declaring functions that take no arguments. Also, the optional `end` keyword should be disallowed unless using `fn` or `def`. Reason being, those keywords are redundant, and serve only as an explicit starter. `end` is also redundant, serving only as an explicit finisher. Therefore they should both be used, or neither used.
Suggest that `fn` mean "pure function", rather than an alias for `def`. The implicit variant could use `=>`, assuming that won't interfere with hash notation.
Suggest that `->` is not used when `fn` and `def` are used. It's unnecessary and looks wrong.
`->!` syntax is a good idea. Explicit variation should specify `Nil` as the return type:
def myfunction() Nil
end
Mixing `,` and `;` in a function signature makes no sense, and will only cause confusion. Should use `,` only.
What is the "unresolved issue" with the private/protected syntax?
What is "returns twice"?
Why is "raises" necessary? Surely the compiler knows that something can raise, because it directly or indirectly calls `raise`.
As for the parentheses in func-def, I think that's a good formalia for consistency. Two parens aren't hard to type for a func def, and they also help make parsing more reliable (in guessing what actually went wrong when there's an error, etc.). The parsing is already a bit of a beast when it comes to func vs calls vs lambdas vs paren expressions vs soft lambdas... I'll make a note of it though!
As for the rest of the suggestions in between, I agree on them at large, let's wait and see if there's some more input first.
The unresolved issue is for the `*(x) ->` operator. I don't think a private one is very common. But it's easily solved via some separator or so: "`(x) ->" or something (gh-markdown can't handle the notation, so I made it a quoted string here).
The `'returns-twice` and `'raises` pragmas are for interaction with C libs, where it can't be inferred since the C funcs' actions are opaque to Onyx. You can google them for more details (LLVM).
I've googled for "returns twice" and can't find anything useful :(
It's only used on "veery special occasions", setjmp, vfork, etc. Chances are you'll never have to think about it again.
[slight edits made 2016-09-20]
Since Onyx is in its ~~spring~~ autumn cleaning phase, I want to narrow down the variations in the language and start getting it solidified. I want to get breaking changes done as early as possible.
This is another of those I've been thinking a while about, but not issued.
fn1(x, y) -> SomeType
   a = do-stuff
   b = SomeType a
   b
fn2(x Int, y Int) -> Int
   a = x + y
   a + 2
fn3(x, y) -> x + y
fn4(x Int, y Int) -> x + y
fn5(x Int, y Int) -> Int -- ~~Gotcha: one-liner assumes value after arrow: returns `Int`!~~
-- Change! Should simply error! "Do you want to return Int, or set return type too?"
fn6(x Int, y Int) -> Int: x + y -- One-liner with return type Int
`fn5`).
`(Foo, Bar) -> Qwo`.
The arrow `->` in function defs, along with the args, is the tell-tale beacon; pretty much `) ->` is the strongly recognisable part. The only con is that one-lined functions require an additional nest-starter after ret-type, iff specified.
Currently:
fun(x) -> x
ret-typed-fun(x) Int -> x
fun-returning-a-type() -> Int -- extremely unlikely in reality, without further code
ret-typed-fun2(x) Int ->
   x
fun-type = '(Int, Bool) -> Int
Proposed:
fun(x) -> x
ret-typed-fun(x) -> Int: x
fun-returning-a-type() ->: Int -- extremely unlikely in reality - ugliness acceptable
ret-typed-fun2(x) -> Int
   x
fun-type = '(Int, Bool) -> Int
And, of course, you could write one-liners using any of the nest-starters, as usual, or even an expression delimiter (`;`):
fn1(x Int, y Int) -> Int: x + y
fn2(x Int, y Int) -> Int => x + y
fn3(x Int, y Int) -> Int do x + y
fn4(x Int, y Int) -> Int then x + y
Arguments and criticism against this change (or for) are highly welcome, please elaborate.
I'm okay with either style, to be honest. I'm not sure why you're calling `new` on `SomeType` though. Don't we just funcall it? `= SomeType()`
You're absolutely right, head got stuck in Crystal mode, haha. `.new` works in Onyx too, but as you say, the preferred style is just "calling the type". I'll edit to make it idiomatic.
What would prevent `a` becoming equal to the type?
a = SomeType
The name of a type or lambda is considered a value primarily, unlike functions and methods, which are primarily calls. This means that either arguments or empty call-parens are required to instantiate.
a = SomeType -- the value of a is the type
a = SomeType() -- the value of a is an instance of the type
a = SomeType(x) -- the value of a is an instance of the type
a = SomeType x -- the value of a is an instance of the type
Did that make it clear (cause I'm not fully in the clear about the question ;-) )?
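For readers more used to Python, the same value-versus-call distinction exists there, which may make the rule easier to picture. This is only a loose analogy written in Python, not Onyx code: `SomeType` below is a hypothetical stand-in class, and Python has no paren-less juxtaposition call like `SomeType x`.

```python
class SomeType:
    """Hypothetical stand-in for the type used in the examples above."""

    def __init__(self, x=None):
        self.x = x


a = SomeType       # no call-parens: `a` is bound to the type itself
b = SomeType()     # empty call-parens: `b` is an instance
c = SomeType(3)    # arguments: `c` is an instance

print(a is SomeType)            # the name alone is just a value
print(isinstance(b, SomeType))  # calling it instantiates
print(c.x)
```

Under that reading, forgetting the parens silently yields the type rather than an instance, which is the kind of mistake the question above is probing.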
I think that might lead to obscure bugs. Given how rarely a type will be assigned to a variable, why not require special syntax?
a = SomeType -- instantiation
a := SomeType -- type assignment
type a = SomeType -- alternative syntax
Thanks for pointing it out, it's also not consistent with funcs/methods. But there are also lambdas and functors, which have the same behaviour as Type. It's a value primarily.
[ed: the below text stems from me thinking and typing at the same time, so it might be a bit un-orderly]
There are voluntary type prefixes (only required in certain situations, not even practical yet), could use that I guess:
a = 'SomeType
But... that fails one of its main usage-ideas:
a-typed-var 'SomeType
Silly me - so of course another symbol is required. The prefix-method should definitely be the style in any event.
Reminds me, I must look into the "make a lambda-value from a function":
func(x) -> x
some-lambda = ->func(Int) -> Int -- or something like that
some-type = ^SomeType -- or something like that
But here then some-lambda (as mentioned) is considered a value, unless called with parens or args (same as type atm).
Allowing plain `SomeType` as instantiation still conflicts with constants. For that reason, considering AnyPascalCased as a compile-time-constant value (regardless of whether it's a type or other value) will be the simplest way. But that's not what we want in Onyx preferably. Differentiating between `SomeConst` and `SomeType` can probably be done almost as efficiently if going deeper into the compile process (can't be solved at a syntactic stage) - I'll keep it in mind and think about it a bit, because it definitely deserves "getting right".
So: Should lambdas - and functors for that matter - (even though they're values) be considered calls at all times, and always require the `->` prefix to be seen as values? It's a possible way of going about it. Just a bit more involved.
Some examples for comparison, first: current syntax (except fun->lambd which I still haven't gotten around to verifying):
some-func() -> 7
type SomeType -- an empty type, pretty useless, but enough for example
a-lambda = ->some-func() -> Int -- convert a func to a lambda - _has to be typed_
another-lambda = () Int -> 5 -- lambda from lambda-literal
type MyFunctor: call() -> 3
a-functor = MyFunctor()
func-using-lambda(lamb) -> lamb() + 47
func-instantiating-type(typ) -> typ.new -- this would require even further complexity to resolve to new'ing from just 'typ'
func-using-lambda a-lambda
func-using-lambda another-lambda
func-using-lambda a-functor
func-instantiating-type SomeType
some-func -- calls func
a-lambda() -- calls
another-lambda() -- calls
a-functor() -- calls
SomeType() -- instantiates
->some-func() -> Int -- func as lambda-value
a-lambda -- value
another-lambda -- value
a-functor -- value
SomeType -- value
And with the proposed changes
some-func() -> 7
type SomeType -- an empty type, pretty useless, but enough for example
a-lambda = ->some-func() -> Int -- convert a func to a lambda - _has to be typed_
another-lambda = () Int -> 5 -- lambda from lambda-literal
type MyFunctor: call() -> 3
a-functor = MyFunctor -- this would now result in instantiation
func-using-lambda(lamb) -> lamb + 47 -- parens not required on lamb (will require deep compilation juggling)
func-instantiating-type(typ) -> typ -- this would also require even further complexity to resolve to new'ing from just 'typ'
-- func-using-lambda a-lambda -- would not do what was intended - a-lambda is now called at call-site
-- func-using-lambda another-lambda -- would not do what was intended - another-lambda is now called at call-site
-- func-using-lambda a-functor -- would not do what was intended - a-functor is now called at call-site
-- func-instantiating-type SomeType -- would not do what was intended - SomeType is now instantiated at call-site
func-using-lambda ->a-lambda -- would work, it's already typed, no need to do again
func-using-lambda ->another-lambda -- would work, it's already typed, no need to do again
func-using-lambda ->a-functor -- would work
func-instantiating-type ^SomeType -- would work
some-func -- calls func
a-lambda -- also calls
another-lambda -- also calls
a-functor -- also calls
SomeType -- now instantiates
->some-func() -> Int -- func as lambda-value
->a-lambda -- value
->another-lambda -- value
->a-functor -- value
^SomeType -- value
So lambdas, functors and types would then follow the same pattern as functions/methods (of course with much simpler 'value'ing notation, since they don't need to be typed up)
It seems, in a way, the possibility of error is just moved to other areas? Is the change worthwhile?
This definitely needs some thinking about. Of course types are the primary interest here when weighing how it should be implemented - the others aren't very common: I just think lambdas/functors should follow along whatever decision is taken for consistency.
And finally: `Foo.Bar.Qwo.some-func` is currently a simple and accessible syntax for scoping through name-spaces. With the change it will no longer be very clear whether it means `Foo.new.Bar.new.Qwo.new.some-func` or the previously described scenario.
After a quick smoke-break, my brain tells me it's better to improve error-messages if `SomeType` is used in a context where it reasonably should have been `SomeType()`: "error so and such bla bla - did you mean `SomeType()`?"
Further on the issue at hand: I've also been thinking about whether operators should be enclosed with delimiters in function-defs for clarity:
Now
ext I32
   *(x SomeType) -> this * x.value
Then
ext I32
   `*`(x SomeType) -> this * x.value
Then also functor call funcs could be defined as `()` instead of `call` (well it could anyway, but it looks a bit crazy!):
type MyFunctor
   `()`(x, y) -> say "{x} and {y}"
It also solves the...
`*`*(x) ->
...dilemma for protected/private (which in honesty isn't that much of a real-world problem, but...)
Another option is to allow this as an alternative style, to be used explicitly only when required (notably only `*` and `**` with protected or private asterisks).
Then the stylizer can render it as wished for repo vs private editing.
I want to either implement or dismiss the syntax change suggested in https://github.com/ozra/onyx-lang/issues/11#issuecomment-217818212, so that this basic construct of the language is finally settled and nailed down.
I think it makes sense and looks much more natural for the different cases of defining a function, with and without explicit return type, single or multi-line.
On the other hand, it does vary slightly with regard to the meaning of a `Constish` following the arrow, depending on if an indented function body follows or not. That's the only detail that nags me.
The variation problem has been solved. A one-liner with return type must separate with a nest-start-token or expression-separator (`;`) (semicolon won't work of course: it can be `ret-type; exprs;` or `exprs; exprs;` ...)
Agreed. New syntax looks better.
One-line statements like `fn5(x Int, y Int) -> Int` should error out as suggested above, to prevent mistakes; but only when apparently returning a type. `fn5 (x Int) -> x*x` should not throw an error.
(the non-erroring code for fn5, if returning `Int` was actually intentional, would presumably be `fn5(x Int, y Int) -> Class: Int`; actually, it's Type now, right? `fn5(x Int, y Int) -> Type: Int`)
fn5 (x Int) -> x*x should not throw an error.
Correct! That's business as usual.
Only when the arrow is followed immediately by what fits the rules of a type and nothing more, will it error. If a body follows after the possible type - it is considered a type. If it doesn't seem like a type after the arrow - it's considered body. It sounds fuzzy, but testing the alternatives, it seems to me there is no risk of accidental mishaps, since the only gray-area immediately errors. And the gray-area should be uncommon enough to not cause an annoyance in every day coding. Basically only if you have a bunch of one-line functions that return a constant only.
But then again: `MyConst = 1; foo() ->; MyConst`. Not too hard ;-)
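The rule described above can be sketched as a small classifier. This is a minimal illustration in Python, not the Onyx parser: the function name `classify_after_arrow`, the `has_indented_body` flag, and the simplification that "fits the rules of a type" means a single PascalCase identifier are all assumptions made for the example. The nest-starters are the ones used in the one-liner examples earlier in the thread.

```python
import re

# Assumption: a "Constish" is approximated as one PascalCase identifier.
CONSTISH = re.compile(r"[A-Z][A-Za-z0-9]*")
# Nest-starters taken from the one-liner examples in the thread.
NEST_STARTERS = (":", "=>", "do", "then")


def classify_after_arrow(tail, has_indented_body=False):
    """Classify the text that follows `->` on a function definition line.

    Returns "return-type", "body", or "error" (the gray area).
    """
    tail = tail.strip()
    match = CONSTISH.match(tail)
    if match is None:
        # Doesn't look like a type: treat it as the function body.
        return "body"
    rest = tail[match.end():].strip()
    if rest == "":
        # A lone Constish: a type if a body follows, otherwise ambiguous.
        return "return-type" if has_indented_body else "error"
    for starter in NEST_STARTERS:
        if rest.startswith(starter):
            after = rest[len(starter):]
            # Keyword starters need a word boundary ("do x", not "double").
            if starter.isalpha() and after[:1].isalnum():
                continue
            return "return-type"
    # Constish followed by something else, e.g. `SomeType a`: an expression.
    return "body"


print(classify_after_arrow("x + y"))
print(classify_after_arrow("Int"))
print(classify_after_arrow("Int: x + y"))
```

Only the lone-Constish one-liner lands in the error branch, which matches the claim that the gray area is narrow.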
The "type of a type" is usually called "kind" in type theoretical lingo, so therefore they're named `Kind` in Onyx.
And you're right: all these three would work (fn2 and fn3 of course inferring the type `Kind`):
fn1(x Int, y Int) -> Kind: Int
fn2(x Int, y Int) ->: Int
fn3(x Int, y Int) -> return Int
Functions / Methods
Note: See https://github.com/ozra/onyx-lang/issues/11#issuecomment-217818212 for latest RFC suggestion!
This is a staple construct of any language, and as such I believe there will be some opinions on it.
I like it to be simple and straight forward to define functions and therefore prefer as little formalia as possible while still making them clearly distinguishable and have a clear beacon (the spatial perspective). At the moment, there are two different ways of defining them, as to not lock in to my personal preference alone, keeping it open for discussion.
Note, I use lisp-style separators in the examples; all the below could of course (as already pointed out) be written with snake or camel.
[edit: I forgot about generics - see separate issue about it instead]
Prefixed with a keyword (most common in other current languages; some different keywords are allowed atm, they're all equal: `fn`, `def`, etc.). Support for this syntax has been dropped - it provides no benefit!
Thoughts concerning both
Thoughts about (1):
`end` keyword, when used, may look a bit unbalanced towards 'identifier'
Thoughts about (2):
Note: As mentioned above: this syntax has now been removed and deemed non-beneficial.
`end` keyword is used
The main reason the keyword style is so common is not because "it's better" - it's because it's easier to write a parser for it!
More Details, Common to Both
Using (1)-syntax in examples.
Sugar for "Functions" Not Returning Any Value
Since this is an imperative language, non (usable) value returning functions are not too uncommon (most often as part of types, where they modify member data of `self` only). In order to avoid accidentally leaking internal state through implicit returns, and not having to tediously and repeatedly type `nil` as final expression, an exclamation mark can be suffixed to the 'function-arrow', implying "action"/"command"/"procedure"/"routine". This makes sure it ends with `nil` and sets return type to Nil. Inspired by LS.
Formal Parameters
`,` (comma) or `;` (semicolon), these separators can be mixed freely to make the signature as clear as possible.
`=` and an expression.
`@` it is automatically assigned to the member-var of that name. Very useful in constructors and setters of different kinds.
`...` (inspired from C++, Java, JS-ES6, CS/LS) - it will be a tuple with all the matched args, it can be placed anywhere in the parameter list.
~~"Soft lambdas"~~ Fragments can only be taken as last parameter, with an identifier prefixed with `&`, atm.
Visibility
After much pondering, I came to the conclusion that the best way to mark visibility is per function (as opposed to grouped), and as sleekly as possible: suffix the name with an asterisk for `protected`, two asterisks for `private`. No asterisk means `public`. Public is the default because of the "openness and patchability" philosophy of Onyx.
GOTCHA: This is the opposite of Modula and Nim, where an asterisk designates public visibility and defaults to private.
The asterisks are only typed at definition, not in calls (they're not part of the name).
It sort of looks like a footnote, sort of "there's a gotcha about this one", I like it.
There is one unresolved issue with this, which I can't imagine ever coming up in practise, but I have a solution for it, which for natural reasons is not at the top of the list atm. Guess what it is. :-)
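To make the suffix convention concrete, here is a rough sketch in Python of how a definition name could be split into the callable name and its visibility. The helper `split_visibility` is hypothetical and is not taken from the Onyx compiler. It also shows where the scheme gets awkward: a name made only of asterisks (an operator definition) cannot be split unambiguously, so the sketch simply rejects it.

```python
def split_visibility(name_in_def):
    """Split a definition name into (call name, visibility).

    Assumes the convention described above: no asterisk is public,
    one trailing asterisk is protected, two are private.
    """
    base = name_in_def.rstrip("*")
    stars = len(name_in_def) - len(base)
    levels = {0: "public", 1: "protected", 2: "private"}
    if not base:
        # Only asterisks: is `**` a private-less power operator or a
        # protected `*` operator? This sketch refuses to guess.
        raise ValueError("ambiguous operator definition: " + name_in_def)
    if stars not in levels:
        raise ValueError("too many asterisks: " + name_in_def)
    return base, levels[stars]


print(split_visibility("my-func"))
print(split_visibility("my-func*"))
print(split_visibility("my-func**"))
```

The asterisks are stripped from the returned name because, as noted above, they are typed only at the definition and are not part of the name used in calls.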
Pragmas
Some pragmas usable with functions: `'inline`, `'no-inline`, `'returns-twice`, `'no-return`, `'naked`, `'raises`.
LLVM is very capable of inlining the right stuff for optimum speed (or size), so this should rarely have to be used.
There are also some pragmas for changing semantics: `'pure` - this is not implemented yet though ;-) And I'm still thinking about better ways to express pure functions, in order to promote writing them.
Additional Notes
Lambdas
`my-lambda() -- call lambda without args`.
~~Soft Lambdas~~ Fragments
This is a special beast - see them in their own issue: #14
Aaaaand, as always: remind me of what I've forgotten or should clarify.