eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0

"is" operator and abbreviated types #3602

Closed CeylonMigrationBot closed 9 years ago

CeylonMigrationBot commented 11 years ago

[@gavinking] Back during the design of M1, we had a long discussion, with plenty of community involvement, where we decided that we should write the is operator expression in the prefix form

is Type expr 

instead of

expr is Type

in order to make it regular with if (is ... ). FTR, we can't write if (is ... ) in postfix form because the following just doesn't work:

if (var = complexExpression is Type)

(Since is has a higher precedence than = in the language, the condition would parse as var = (complexExpression is Type).)

Anyway, this decision meant I had to sacrifice the ability to use type abbreviations with the is operator. The problem is stuff like this, for example:

Category []
Void (String)

which might be a type or might be a type followed by an expression.

At the time that was no big deal, since the only type abbreviations we had were T[] and T?, and they were easily enough written in other ways. Furthermore I didn't, at the time, support intersections, unions, and grouping in the Type.

However, times have changed, and we now support:

And what I have essentially ended up with is a parallel type expression grammar that specially handles the type expression that appears in the is operator, but not the type expression in if (is ...). I don't think this is an acceptable state of affairs.

If it was difficult to explain to newbies the difference between prefix and postfix is Type before, well it's going to be way more difficult to explain why they can write is <String[]> foo but not is String[] foo in certain obscure places in the language.

So we need to fix this.

Two possibilities are:

According to the second option, you would always write:

if (object is String) { ... }

Except when you need to introduce an alias like this:

if (is String string = sequence[i]) { ... }

I don't have a preference between these two options.

Finally, a different solution would be:

Unfortunately, it does not look like either of the following abbreviations for function types would be possible:

(X,Y)=>Z
[X,Y]=>Z

Since they both result in ambiguities.

[Migrated from ceylon/ceylon-spec#496] [Closed at 2012-12-17 21:36:34]

CeylonMigrationBot commented 11 years ago

[@gavinking] Actually [X,Y => Z] would not work either, since it doesn't allow for multiple parameter lists. The only thing I can think of that would really work would be something like <Z(X,Y)>, i.e. always consistently require that a function type appears inside angle brackets.

I don't like that solution.

CeylonMigrationBot commented 11 years ago

[@RossTate] What about disallowing whitespace in the type shorthands? Personally, I hate Category [] and Void (String).

CeylonMigrationBot commented 11 years ago

[@gavinking] @RossTate

That would certainly work, and I agree, I would never write Category [] or Void (String), but equally I've always hated languages which have some weird special whitespace-sensitive rule to disambiguate some special case. It looks and feels like a hack. I suppose it would be reasonable in a language like Python where whitespace is generally meaningful, but not here, I don't think.

Especially not since our grammar is otherwise very clean.

CeylonMigrationBot commented 11 years ago

[@RossTate] I understand, but it sounds better than the alternatives. After all, we already had a big discussion on what people prefer, and you've just agreed that allowing whitespace enables unpleasant options, so with removing whitespace (generally, not just for is) we're both letting people have what they've already said they like and preventing people from doing things we don't like.

CeylonMigrationBot commented 11 years ago

[@gavinking] Well a heavy-handed solution is to just add a big ugly syntactic predicate to the grammar to squash the ambiguity, always resolving in favor of interpreting [] or (Type) as part of the type expression wherever that makes sense. It's pretty unlikely that is Type [] is ever meant to be testing the assignability of [], or that is Type (OtherType) is ever meant to be testing the assignability of the metamodel object OtherType (you would anyway just write it as is Type OtherType).

But in principle I hate this kind of syntactic predicate. In this instance I doubt that it will cause problems for the IDE, but I would have to check that. Syntactic predicates are very dangerous for the IDE.

CeylonMigrationBot commented 11 years ago

[@gavinking] Well, I've played around a bit, and for now this "big ugly syntactic predicate" seems like the least worst solution:

(DEFAULT_OP | ARRAY | LPAREN tupleElementType? (COMMA|RPAREN))=>

Solution applied in @a6186007dd897955d57e9423efd5c57559985f46.

I still don't like it, on principle, so if I can convince you guys to revert to the old syntax with the postfix is Type operator, I would prefer to do that.
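For readers less familiar with syntactic predicates, the lookahead described above can be sketched roughly in Python. This is only an illustration under stated assumptions, not the behavior of the actual ANTLR grammar: the token spellings for DEFAULT_OP and ARRAY, the helper names, and the reduction of tupleElementType to a single capitalized identifier are all hypothetical.

```python
# Assumed token spellings; the predicate names the tokens but not their text.
DEFAULT_OP = "?"
ARRAY = "[]"


def is_type_name(token):
    # Assumption: a type name is any identifier starting with an uppercase letter.
    return token[:1].isupper()


def abbreviation_follows(tokens):
    """Decide whether the tokens after a type name continue the type.

    Mirrors the shape of the predicate: DEFAULT_OP, or ARRAY, or
    LPAREN followed by an optional element type and then COMMA or RPAREN.
    """
    if not tokens:
        return False
    first = tokens[0]
    if first in (DEFAULT_OP, ARRAY):
        return True
    if first == "(":
        i = 1
        # Optional element type (simplified to one capitalized identifier).
        if i < len(tokens) and is_type_name(tokens[i]):
            i += 1
        return i < len(tokens) and tokens[i] in (",", ")")
    return False


# "is Category [] ..." is resolved in favor of the type Category[].
print(abbreviation_follows(["[]", "foo"]))
# "is Void (String) f" is resolved in favor of the function type Void(String).
print(abbreviation_follows(["(", "String", ")", "f"]))
# "is Type (x + y)" is left alone: the parenthesis starts an expression.
print(abbreviation_follows(["(", "x", "+", "y", ")"]))
```

The point of the sketch is that the ambiguity is always resolved toward the type abbreviation whenever the next few tokens could be one, which is the trade-off the comment above describes.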

CeylonMigrationBot commented 11 years ago

[@quintesse] Well to me the problem with the old syntax has always been that it was confusing. One format does narrowing but the other doesn't and nothing tells you which is which. So for new people it would result in very confusing situations where code that seemingly is the same in reality does different things. And I guess that even for not so new people it will trip them up from time to time.

So maybe option 2 would be okay too, if I understand it correctly at least: you would always use the infix, which would always narrow (unless used within an expression with && and ||) and only use the prefix when you need an alias.

CeylonMigrationBot commented 11 years ago

[@gavinking]

One format does narrowing but the other doesn't and nothing tells you which is which.

Well originally the rule was that prefix is Type narrows and postfix is Type doesn't. But then we changed it and to my mind that was the reason it became a bit confusing.

If we want to go back to how we originally had it, there would be no problem at all, since a prefix is Type would never be followed by something like [] or (OtherType) because those are not expressions that can be narrowed. That is to say, under this approach nothing would stop us supporting stuff like:

is String object then object.uppercased else "not an object"

The parsing problem I have is specifically with is String <some complex expression> where <some complex expression> is something that can't possibly be narrowed.

CeylonMigrationBot commented 11 years ago

[@gavinking]

That is to say, under this approach nothing would stop us supporting stuff like:

is String object then object.uppercased else "not an object"

Indeed, since the syntax is String object would not be an operator expression, it would be much easier to support this, and even stuff like the following expressions which define new values inline:

is String str = objects[i] then str.uppercased else "not an object"

exists str = strings[i] then str.uppercased else "null"

Since here the then/else could easily be parsed as part of the is construct.

Hell, given the above, I would be very sympathetic to introducing a general-purpose let expression, something like:

given total=sum(numbers) then "total: " total ", average: " total/numbers.size ""

CeylonMigrationBot commented 11 years ago

[@gavinking] Note: we would not implement support for given ... then ..., is ... then ... else ..., or exists ... then ... else ... in Ceylon 1.0. We would leave it for Ceylon 1.1. All we would do for now is drop support for the prefix is Type, exists, and nonempty operators, and support the postfix form instead.

CeylonMigrationBot commented 11 years ago

[@gavinking] Actually I think this is definitely what we should do. It cleanly ties together several loose ends we've had floating around:

#3363, #3483, this thing <#3276#issuecomment-3893241> which has been mentioned many times in several discussions, and this current issue.

CeylonMigrationBot commented 11 years ago

[@gavinking] I have merged the m5syntax2 branch, so closing this issue.

CeylonMigrationBot commented 11 years ago

[@gavinking] FTR, I was thinking today that if we want = to have consistent precedence relative to other symbols across the whole language, which is definitely desirable, then that means that prefix exists, nonempty, is Type constructs must have a lower precedence than the postfix exists, nonempty, is Type operators. That is, in

exists x0 = xs[0]

The exists symbol has a very low precedence, lower than =. Its precedence is similar to the precedence of value in:

value x0 = xs[0]

OTOH, in

value b = x[0] exists;

the exists operator has a higher precedence than =.

So that's another reason to draw a sharp distinction between the two constructs.
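The two groupings described in this last comment can be made explicit with a toy Python function. It is a hedged sketch only: the function name, the token lists, and the parenthesized output format are hypothetical, and it handles just the two shapes discussed here rather than the real grammar.

```python
# Keywords that exist both as prefix constructs and as postfix operators.
CONDITION_KEYWORDS = {"exists", "nonempty"}


def group(tokens):
    """Parenthesize a flat token list to show how it would be grouped."""
    if tokens and tokens[0] in CONDITION_KEYWORDS:
        # Prefix construct: binds more loosely than '=', so it scopes
        # over the whole specification that follows it.
        return f"{tokens[0]} ({' '.join(tokens[1:])})"
    if "=" in tokens:
        eq = tokens.index("=")
        lhs = " ".join(tokens[:eq])
        rhs = tokens[eq + 1:]
        if rhs and rhs[-1] in CONDITION_KEYWORDS:
            # Postfix operator: binds more tightly than '=', so it
            # applies to the right-hand side only.
            return f"{lhs} = ({' '.join(rhs)})"
        return f"{lhs} = {' '.join(rhs)}"
    return " ".join(tokens)


print(group(["exists", "x0", "=", "xs[0]"]))
print(group(["value", "b", "=", "x[0]", "exists"]))
```

In the first call the keyword wraps the whole specification, much as value would; in the second the operator attaches to x[0] before the assignment is considered.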