qwertie / ecsharp

Home of LoycCore, the LES language of Loyc trees, the Enhanced C# parser, the LeMP macro preprocessor, and the LLLPG parser generator.
http://ecsharp.net

Rename #of to 'of, and perhaps rename #tuple #96

Closed: qwertie closed this issue 4 years ago

qwertie commented 4 years ago

List!Foo, or List<Foo> in C#, is translated to a Loyc tree like #of(List, Foo), but isn't this an operator? If it is an operator, it should use the operator marker which is an apostrophe, and it should probably be renamed to 'of.

Also, multi-argument generics like Map!(String, Object) or Map<String, Object> are currently stored as a single call like #of(Map, String, Object), but the parser would be a bit simpler if ! were treated more like a normal binary operator: 'of(Map, #tuple(String, Object)).
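To make the two shapes concrete, here is a minimal Python sketch of the rewrite. It is not LoycCore code: it assumes a toy representation where a call node is a tuple `(target, arg1, ...)` and identifiers are plain strings, and `to_binary_of` is a hypothetical helper name. Leaving a single type argument unwrapped is also an assumption, based on List!Foo already being binary.

```python
def to_binary_of(node):
    """Rewrite ('#of', base, a, b, ...) into the proposed binary form.

    Toy model only: nodes are tuples, names are strings (an assumption,
    not the real Loyc tree API). Nested nodes are not visited.
    """
    # Anything that is not an '#of' call with a base and at least one
    # type argument is returned unchanged.
    if not (isinstance(node, tuple) and len(node) >= 3 and node[0] == "#of"):
        return node
    base, type_args = node[1], node[2:]
    if len(type_args) == 1:
        # Assumed: a single type argument stays a plain second operand.
        return ("'of", base, type_args[0])
    # Several type arguments are grouped into one tuple operand.
    return ("'of", base, ("#tuple",) + type_args)


multi = to_binary_of(("#of", "Map", "String", "Object"))
single = to_binary_of(("#of", "List", "Foo"))
```

With this shape, every 'of node has exactly two operands, so code that handles binary operators would not need a special case for generics.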

And what about tuples? It seems to me that the tuple constructor is operator-like, so it should be marked with ' instead of #. But is 'tuple an appropriate name? I like avoiding English names where possible to make Loyc trees more international; it could be called '() instead, but this wouldn't communicate the distinction between (x;) and (x). Since semicolon and comma are used in different languages to separate tuple items, neither '(,) nor '(;) is a uniquely good name either, especially since both of them kind of look like they might represent 1-tuples instead of N-tuples... so I'm leaning toward 'tuple.

Similarly, 'of could be renamed to use punctuation only, but different languages use different punctuation for genericity, and while X<T> is the most popular syntax, it isn't used in LES. Any thoughts, @jonathanvdc?

qwertie commented 4 years ago

A bunch more things need the same treatment:

#new, #is, #as, #cast, #sizeof, #typeof, #default(T), #usingCast => 
'new, 'is, 'as, 'cast, 'sizeof, 'typeof, 'default(T), 'using,

Edit: the #new attribute should not change (because other attributes have similar names like #public, #static, etc.) but only the #new operator should get the operator prefix: 'new. At the lexer level, the lexer doesn't know which kind of "new" it's looking at; I decided arbitrarily for it to store 'new in the token. For #default, which is now lexed as 'default, I decided to let the switch label default: have the oddball representation #label(@'default). This is fairly arbitrary... given the earlier decision that default: was a "label", I could have reasonably stored it as #label(@default) or #label(#default). The reason to use #label(@'default) is that it avoids adding an extra branch in the EC# parser, and I didn't like #label(#default) as it would mean that the word default would have three representations across EC# and LES which is, like, a bit much.
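The rename list and the attribute exception can be summarized as a lookup table. The following Python sketch is only an illustration: `OPERATOR_RENAMES`, `rename_symbol`, and the `is_attribute` flag are hypothetical names, and the real change lives in the EC# lexer and parser rather than in a table like this.

```python
# Old name -> new name, taken from the list above.
OPERATOR_RENAMES = {
    "#new": "'new",
    "#is": "'is",
    "#as": "'as",
    "#cast": "'cast",
    "#sizeof": "'sizeof",
    "#typeof": "'typeof",
    "#default": "'default",
    "#usingCast": "'using",
}


def rename_symbol(name, is_attribute=False):
    """Return the new name for an operator, leaving attributes alone.

    `is_attribute` is an assumed flag: the caller must already know
    whether the symbol is used as an attribute (like #public, #static,
    or the #new modifier) or as an operator.
    """
    if is_attribute:
        return name  # the #new attribute keeps its # prefix
    return OPERATOR_RENAMES.get(name, name)


as_operator = rename_symbol("#new")
as_attribute = rename_symbol("#new", is_attribute=True)
```

The same spelling maps to two different results depending on its role, which is the distinction the lexer cannot make on its own.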

The "alternate list" marker # is used to demarcate lists within constructs, such as argument lists and lists of base classes (#class(Foo, #(IFoo, IBar), {})). It seems like it is more like an "operator" than a "construct" but I think it's best to keep the name # so that the LES/EC# syntax remains easy to express - #class(Foo, @'(IFoo, IBar), {}) is rather awkward. Similarly, #splice seems sort of like an operator but it will normally be printed in prefix notation as #splice(...) and so will be a bit more readable if it continues using the # prefix.

C#'s checked and unchecked can be used like an operator or a statement and it seems to make sense to only have a single name for it, so... I guess the existing names #checked and #unchecked would be acceptable.

The new C# switch expression is written as a binary operator, and it seems substantially different from the old switch statement. I suppose it makes sense to use #switch for the old one and 'switch for the new one... I have no firm opinion though, nor time right now to implement support for the switch expression.

qwertie commented 4 years ago

The suffix operators such as '++suf are unusual in that they use an English abbreviation to indicate meaning, and their appearance doesn't match their usage: since '++suf represents ++ after an expression, a better name would be 'suf++.

The existing names '[] and '_[] suggest a pattern to follow. The first means "list in square brackets", the second means "indexing operator". The underscore represents "something" so that _[] means "something followed by square brackets", hence foo[i] => @'_[](foo, i). Similarly I propose '_++ to mean "suffix ++", '_-- to mean "suffix --", and so forth.

Alternatively I could change '_[] to 'suf[] and '++suf to 'suf++ to produce a consistent pattern. But I'm leaning toward '_++.

qwertie commented 4 years ago

On second thought... I like 'suf++ because it is much more searchable. Finding suffix operators by looking for suf is going to find far more relevant results than _. Granted, it's a bit rare that you're looking for all suffix operators, but it happens. This implies '_[] should be renamed to 'suf[].
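As a rough illustration of the naming scheme and the searchability argument, here is a small Python sketch. `rename_suffix_operator` is a hypothetical helper, not something in the Loyc codebase, and the regular expression assumes that old suffix names always have the form apostrophe, operator, `suf`.

```python
import re

# Old style: apostrophe, operator punctuation, then "suf" (e.g. '++suf).
_OLD_SUFFIX = re.compile(r"^'(?P<op>.+)suf$")


def rename_suffix_operator(name):
    """Map old suffix-operator names to the 'suf... scheme."""
    if name == "'_[]":
        return "'suf[]"  # the indexing operator joins the same pattern
    match = _OLD_SUFFIX.match(name)
    if match:
        return "'suf" + match.group("op")
    return name  # not a suffix operator: leave untouched


old_names = ["'++suf", "'--suf", "'_[]", "'[]", "'+"]
new_names = [rename_suffix_operator(n) for n in old_names]

# A plain substring search now finds every suffix operator and nothing else.
suffix_ops = [n for n in new_names if "suf" in n]
```

Searching for `_` instead would also match any unrelated identifier that contains an underscore, which is the drawback of the '_++ alternative.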

qwertie commented 4 years ago

These changes were made in v2.7.0.4.