ozra closed this issue 8 years ago
Types not extending a super-type look the ugliest. That's a pity, but the alternative is that the notation varies for that case. Consistency can be a plus, especially since the notation is then just the same for anonymous types: the type name is simply replaced with construction-arguments in parentheses.
I will update both the Sublime and Atom highlighters to support all Onyx variations, including this one, so it can be seen highlighted in one of those editors if you have access. I will wait until some cleaner ideas come up for the tokens. There's room for improvement!
It would help if your example did not require a degree in engineering to understand.
Why all this `end Material` malarkey? Optional explicit endings, I suppose. I wouldn't use them.
However, I really like the new module `:` syntax. And the class syntax isn't looking too alien ;)
Really don't like `::` for classes though. The symbol is fine for scoping and such, but here it just looks wrong.
Why the redundant `:` in a single-line class declaration? What is the current one-liner class declaration syntax? Do we even want to permit one-liner classes? And if so, wouldn't it make more sense to prohibit the terse syntax for one-liners, and mandate a colon?
type Any: +(other MyType) -> self + other.the-value
Or possibly `<::` terse syntax?
Any <:: +(other MyType) -> self + other.the-value
Don't worry too much about the example :-) The notation is what matters, I just happened to cut and paste some pieces to whip up some context quickly.
Yes, `end TheName` is optional explicitness for increased structural checking. I only use them for long typedefs (as always, the stylizer can easily add or remove them according to personal choice, and the stylizer should soon be working in alpha).
Regarding the "one liner": this is not a syntactic special case in Onyx. All indent constructs can be "one-lined" by using "nest start tokens" (and they can be used even if indented, of course). They are, as mentioned before, `:`, `=>`, `then` and `do`, according to the user's choice. For functions the `->` beacon also doubles as nest-starter. In the example the indent is simply replaced with `:`. If type-description blocks are always started with a beacon-doubling-as-nest-starter, like functions, using say `::` (which we both agree is a bad choice, though), then it's solved with standard notation at all times. Then, of course, it is a special case, but not really, since it's standard type-def notation ;-)
One-liner defs on types are very useful, not so much for defining types as for adding functionality (monkey patching). As per the example above: adding addition-arithmetic generally, via one single line, for the cases where your specific type is on the right of the operator in an expression (`1 + MyType(1) == 2`).
In Onyx scoping is simply `.` (why complicate things? ;-) ), but yes, that's what one commonly comes to think of "thanks" to C++ (and Crystal) etc. So yeah, some magically simple idea here is what needs to land B-)
`<::` should be valid in any event, so I guess that would be fine actually. (It's just `<:` followed by `:`; no spacing is required.)
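To illustrate why `<::` would need no special rule, here is a minimal sketch in Python (not Onyx's actual lexer, which isn't shown in this thread). It assumes a plain longest-match lexer and a made-up symbol list; `SYMBOLS` and `lex_symbols` are hypothetical names.

```python
# Hypothetical symbol set, for illustration only.
SYMBOLS = ["<:", "->", "=>", ":", "<"]


def lex_symbols(text):
    """Split a run of symbol characters into tokens, longest match first."""
    ordered = sorted(SYMBOLS, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # spacing between tokens is optional
            continue
        for sym in ordered:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"unknown symbol at {i}: {text[i:]!r}")
    return tokens


print(lex_symbols("<::"))   # the terse one-liner form
print(lex_symbols("<: :"))  # the same two tokens, written with a space
```

Under these assumptions both spellings produce the same two tokens, so the one-liner form is just the ordinary subtype token followed by the ordinary nest-starter.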
The only thorn then is that I still find that a trailing `<:` looks less than perfect. It doesn't sit just right, feels... unbalanced.
MyType <:
   my-defs() -> blargh
Yeah, but to understand code, I need to already understand either the syntax or the context. And you're changing the syntax, so the context needs to be clear!
> I still find trailing `<:` looks less than perfect
I see what you mean, but it will do, nonetheless.
Haha, good point.
Why does this have the breaking tag? Are we dropping support for the existing notation?
To avoid having variations on every little token for different approaches, there will likely be some adjustments in the existing notation too, but it won't be removed. Those favouring the large beacons (the keywords) should have them.
After some sleep: this is why the "breaking" tag was added. Type declaration is changed from style A (current) to style B (see #18). Currently, the parsing of the type description is decided via a named "type-builder". After the change to B (the terse variation is a syntactic variation using the same concept as B, but ditching the `type` keyword), the type being extended decides how the description is parsed. If you extend "Enum", the type-description is parsed as an enum-description (which differs slightly in syntax from a common reference type: free constants are enumerated). If you extend "Value", your type will be a struct / stack-allocated, instead of using the `value` type-builder in style A, or `struct` in Crystal.
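A rough sketch of "the supertype decides how the body is parsed", in Python rather than Onyx. Everything here is an assumption made for illustration: the names `KIND_BY_BASE` and `description_kind`, the kind labels, and the fallback for user-defined supertypes are not taken from the Onyx compiler.

```python
# Hypothetical sketch: names and kind labels are illustrative, not Onyx internals.
KIND_BY_BASE = {"Enum": "enum", "Value": "value", "Reference": "reference"}


def description_kind(header):
    """Pick how the type body should be parsed from a header like 'MyEnum <: Enum'."""
    _, sep, super_part = header.partition("<:")
    supertype = super_part.strip()
    if not sep or not supertype:
        # No supertype written: assume an implicit Reference.
        supertype = "Reference"
    # Simplification: a user-defined supertype is treated as a reference type;
    # a real compiler would look up that type's own ancestry instead.
    return KIND_BY_BASE.get(supertype, "reference")


print(description_kind("MyEnum <: Enum"))
print(description_kind("MyVal <: Value"))
print(description_kind("MyType <:"))
```

The point of the sketch is only that no separate type-builder keyword is consulted: the header's supertype is the single input to the decision.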
Yes, specifying `value` and then a value base type was a little redundant.
I thought of a way to avoid the "trailing `<:`". It's super logical. Super consistent. But whether it's good is another question. I'll throw it up in the air here for an additional angle of attack at least.
The most common type declared is a reference type, very often without a specific super type. `Reference` is therefore implicit.
How about not making supertype implicit?
This does increase the beacon to the level of using the `type`-prefix, but that is actually good in this case. Getting rid of the `type`-keyword is of most benefit for cases like `type MyEnum <: Enum`, where `MyEnum <: Enum` is clear enough and the keyword becomes redundant for identification. Declaring a common ref-type would then be: `type MyType <: Reference`.
"Reference" however sounds kind of weird seeing it everywhere. So, ideas:
`Any` is currently the top-type (actually, there's even `Type` above `Any`). Under that we have the branch to `Reference` and `Value`.
Reference and Value could be renamed to one of:
- `Ref`, `Val`
- `Obj`, `Val`
- `Object`, `Value`

(Onyx:`Object` would then correspond to Crystal:`Reference`, Onyx:`Any` to Crystal:`Object`.)
Trying alternatives out:
MyType <: Obj
   @foo Int = 0
   @bar Str = "Blargh"
   init() ->

-- Single line monkey patching Any:
Any <: Type: +(other MyType) -> self + other.foo

-- Multi line monkeying
Any <: Type
   -(other MyType) ->
      self - other.foo

MyVal <: Val
   @x Int = 0
   @y Int = 0
   init() ->

MyEnum <: Enum
   A
   B
Downside: Monkey patching a complexly extended type:
MyKzzxpmgfType[T1, T2, T3, U1, U2] <: Obj
   @foo Tup<T1, T2, T3> = (U1, U2, 3.14)
   init(...) ->

MySubT[U1, U2] <: MyKzzxpmgfType[Int, Int, F32, U1, U2]
   -- nothing special here perhaps...

-- later on a monkey-patching
MySubT[U1, U2] <: MyKzzxpmgfType[Int, Int, F32, U1, U2]
   get-first() -> @foo.0

-- that monkey-patching in "style A", sliiightly cleaner:
type MySubT
   get-first() -> @foo.0
There is the alternative of actually using a keyword for `redef`-ing, thereby also making a monkey-patch explicit (if the type is not declared already, it will error):
redef MySubT
   get-first() -> @foo.0

redef MySubT: get-second() -> @foo.1
redef Any: +(other MyType) -> self + other.foo
We need to find a gelling solution here that addresses these scenarios in an intuitive way. Now there's at least a couple more ideas to tear apart and put together into something hopefully one notch better than the original proposition B + terse version.
After the worst ideas are weeded out I think I'll implement a wide set of possibilities, so they can be tested live for some time, then it will be clearer what feels right in practice, and what turned out to be crap ideas and can be ditched.
`Object` and `Value` are probably the least irritating of the lot, but I don't think we should use any of them. I shouldn't have to specify the base class when monkeypatching, neither should I have to specify a base class for an object that doesn't have one!
No, I think keep the trailing `<:` for a multiline class, and use `<::` for a single-line class.
A trailing `<:` isn't too bad anyway. It's very similar to the trailing `:` on a module. I don't see the problem.
You're right, it's not "too bad", but in comparison the keyworded version looks a lot more aesthetic. The culprit is the `<` + `:` combo of course. It's fine in some fonts: for instance Andale (which I use for its enormous Unicode support, lacking in many monospaced fonts) is OK in my editor, but in Chrome's font it looks like crap (the edges of the angular in relation to the dots are completely mis-aligned, making it look like it's gonna explode). I think this is one of the reasons many are attracted to, say, Ruby: the clean look in most display contexts, stemming from its avoidance of such symbol combos for the most part (but it pays a high price by yielding an extremely tedious, bloated syntax).
`type Foo < Bar` looks clean. `type Foo <: Bar`, not as much, given the "badly designed fonts" predicate, which is all too common. Twilight of the gods! I hate bad typography! Reality must be considered. Had all fonts been perfectly weighed, then no problem, but as it is...
Well. Currently indent-call sugar is not implemented on Const-like. So it could be used for another purpose. But I think it's very good to continue to mirror regular call syntax for type-instantiation: the example with AST-nodes using indent-syntax is but one of the potentially great use cases. So, it should be left for that.
Let's continue. Modules are pretty much perfect with the colon only, ah, happy break.
Back to the culprit.
An alternative (which isn't good...) is to continue to use `<` only. Problem: the most commonly occurring typedef, implicitly extending Reference:
Foo < Bar
   do-stuff() -> almost-all-good

Qwo <
   do-stuff() -> not-as-good
The above is parseable, in the same way that the current PoC angular tuple notation is parseable (which will go away...). It's possible, but with caveats, and it makes producing informed error messages harder. As long as the coder writes well-formed code, it's just an additional job for the compiler: it has to try two parsing paths to figure out which is meant, it has to parse like a human. (This is already common in Onyx; most languages are designed only to make parsing simple, with no concern for human perception - bah!) The real problem arises when one makes an error. The compiler: "Was it a line-split 'less-than' with a constant on the left that errored, or was it a type-def that errored?" A heuristic approach can solve that. Since named types are always declared at top level, and there is never a practical case for an `x < y`-expression in a non-returning position, we'll know that it most likely is a type def, and can then produce an error message that leads the coder the right way 99.99% of the time (give or take +/-X% for an on-the-spot-made-up statistic).
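The heuristic described above could look roughly like this. It's a Python sketch under stated assumptions, not compiler code: `classify_trailing_lt` and `TRAILING_LT` are made-up names, and the only signal used is whether the line sits at top level.

```python
import re

# A capitalised constant followed by a trailing "<" and nothing else.
TRAILING_LT = re.compile(r"^[A-Z][\w-]*\s*<\s*$")


def classify_trailing_lt(line, indent_level):
    """Guess what a line ending in 'Name <' most likely is, for error reporting.

    Returns "typedef", "comparison", or None when the line has another shape.
    """
    if not TRAILING_LT.match(line.strip()):
        return None
    # Named types are declared at top level, and a bare comparison there
    # would have no use, so assume a type definition was intended.
    if indent_level == 0:
        return "typedef"
    # Inside a body, a trailing "<" is more plausibly a line-split comparison.
    return "comparison"


print(classify_trailing_lt("Qwo <", 0))
print(classify_trailing_lt("MyConst <", 2))
print(classify_trailing_lt("Foo < Bar", 0))
```

The guess only steers which error message to show when parsing fails; well-formed code would still go through the normal two-path parse.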
So above is in practice pretty much fuckin' Aye OK! But! ...There's still a problem. The human parsing! The brain will (maybe...) have the same problem of discerning between type definition and comparison. Not too nice if such is the case. The contexts they appear in are however very different. Let's throw in some spatial intimacy for a visual comparison:
MyConst = 47

Foo <
   do-stuff(a) -> MyConst < a

Bar <
   @a = 0 'get
   init(@a) ->
   do-stuff(a) -> MyConst < a

Any <: +(other Bar) -> self + other.a
Last and most sober thought is of course that the fucking `<:` will have to do, but those of us swearing over it use a Unicode alternative in our own "private workdir style". So I'll repeat the above, for even more perspective, with the mid-dotted version (see further down also):
MyConst = 47

Foo <·
   do-stuff(a) -> MyConst < a

Bar <·
   @a = 0 'get
   init(@a) ->
   do-stuff() -> MyConst < @a

-- spaced colon visually better with my web font (while Andale is fine):
Any <· : +(other Bar) -> self + other.a
This looks good. The `<·` (or `⊢` below) gives a much stronger "it's a typedef"-beacon, minimizing the strain on the brain for keeping track of what is what. That's also more important than aesthetics. It's a correct synonymous symbol for `<:` in type theory notation, so the understanding is also "more universal" than common constructs in programming languages.
Examples - which may look better or worse depending on font (which continues the original problem - as long as there are multiple code-points in combo, but even then for Unicode, even one symbol can misalign, further exacerbating the problem :-/ )
-- this is also correct type relation notation, mirroring theory - the single dot
-- is less likely to mis-align badly than the colon
foo <· bar
   do-stuff() -> alright-right-oooor?

-- this is not really a semantically correct use of the
-- symbol, but looks clean (on _my screen_...)
foo ⊢ bar
   do-stuff() -> alright-right-oooor?
Mid-dot is incidentally already configured on my keyboard on "alt.gr+dot".
Well, there, even more junk thrown into the cauldron, let's see what will cook.
I'm meanwhile prioritizing other points on the roadmap until this gels.
I like the `<·` option. Although some fonts - such as the one used by this edit box! - misalign it :(
Best keep the `<:` option as well.
I don't like `<· :` for one-liners. Should be `<··`. I also don't like `⊢`: it's abstract, and doesn't suggest classes to me. Acceptable alternatives include `⋖` and `⩹`.
Macs don't have an altgr :(
In thread:
In edit box:
Yes, it seems like `<:` and `<·` will be the primary candidates then. Both also have the correct semantic meaning "in maths", which is a bonus.
Allowing `<··` would be to sidestep standard syntax, and I don't think exceptional syntax rules should be added without very good reason. Sugar is added when warranted, but the target, especially for structurally affecting tokens, is to have as few and as clean generically re-usable rules as possible.
That said, I see the suggestion being worth keeping open for consideration.
The composite symbols can be evaluated for support in fonts and look on common setups. On my laptop they're pretty indecipherable (because of small screen / resolution) :-/
But `<· :` looks bad. And people - such as me - will forget the `:` and wonder why it's not working.
Yes, the more I try the alternatives in code, it becomes more and more clear to me that the terse module syntax rocks, and the terse type syntax sucks.
...
This comment and follow-ups moved to https://github.com/ozra/onyx-lang/issues/18#issuecomment-213474272.
Terse module notation is now implemented.
Since terse notation overall turned out to generate one great idea, terse module-def, which is now implemented, with the rest being collateral, I consider this issue closed now. All type-notation discussion will go into #18.
This sprung out of thinking in #18.
The idea is that, for high performance coders, beacons need not be screaming. These coders navigate the code space more easily, and so the important identifying properties of code (the non-formal parts) are more important: identifiers and the like. Unnecessary keywords get in the way for this category of people.
For those coders with a slightly lower spatial capability, science seems to show that they might feel "visually lost at sea" with minimal beacons - they need bigger trunks to "visually hold on to". For that reason the syntax can be used with type-defs, module-defs etc. prefixed with keywords, as is common in almost all languages today. In other words these can be magically inserted/removed with the stylizer, so they can be on in repo-style, but off in your work-cache-dir where you edit.
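To make the stylizer idea a bit more concrete, here is a toy sketch in Python. It is not the Onyx stylizer: `add_keywords` and `strip_keywords` are hypothetical names, and the sketch only handles type headers that contain `<:`, one per line, using regular expressions instead of a real parser.

```python
import re

# A terse header such as "MyEnum <: Enum", optionally indented or generic.
TERSE_HEADER = re.compile(r"^(\s*)([A-Z][\w-]*(?:\[[^\]]*\])?\s*<:.*)$")
# The same header in keyword-prefixed style: "type MyEnum <: Enum".
KEYWORD_HEADER = re.compile(r"^(\s*)type\s+(.*<:.*)$")


def add_keywords(source):
    """Repo style: prefix terse type headers with the 'type' keyword."""
    lines = source.split("\n")
    return "\n".join(TERSE_HEADER.sub(r"\1type \2", line) for line in lines)


def strip_keywords(source):
    """Working-dir style: drop the 'type' keyword from type headers."""
    lines = source.split("\n")
    return "\n".join(KEYWORD_HEADER.sub(r"\1\2", line) for line in lines)


terse = "MyEnum <: Enum\n   A\n   B"
print(add_keywords(terse))
print(strip_keywords(add_keywords(terse)) == terse)
```

Because the two transforms are inverses on these headers, the same file can be shown with keywords in the repository and without them in the editing copy.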
There's an alternative example of the first case as a last example with just a slight variation which looks better for certain cases.
Disclaimer: this is show-case pseudo code, naturally not tested.
First off, the Efficient Minimal Formalism Syntax:
The Common Keyword Prefixed Style
Terse Style - A Variation
The trailing `<:` looks funny. So it might be better to have `<:` mean extend / show the subtype/supertype relation, and `::` (or something else!) designate the beginning of the type description body. I don't really like the double colon: it might cause confusion in relation to other languages. In any event, the token must be distinct from single colon and arrow.
This way:
Otherwise:
Compare Keyword Prefixed:
Compare Crystal:
The full blown show-case again:
The terse/clean style is surprisingly distinct in its expression: I hacked together a grammar for Sublime Text in about 30 minutes, and baam, immediately you could "jump to definition", etc. in the editor (and highlighting of course).