parapluu / encore

The Encore compiler.
BSD 3-Clause "New" or "Revised" License
43 stars 26 forks source link

RFC: Syntactic sugar for convenient operators #591

Open TobiasWrigstad opened 7 years ago

TobiasWrigstad commented 7 years ago

Taking a leaf out of Python's book, I propose (as we have discussed recently at some weekly meeting), that we support a mapping from some standard operators to methods with special names.

For [], I propose that we allow any type signature. That will allow a user-space implementation of maps with familiar syntax.

For < etc. operators, I propose that we require that the type of both operands are the same and require the return type to be bool, i.e., in a trait T, the only permitted signature of ge is:

def ge(other : T) : bool

Similarly, we could allow expr1(expr2) to desugar to expr1.apply(expr2) when expr1 returns an object, although I am not convinced we gain more than we add confusion.

op    type signature         desugars to
---------------------------------------
[]  : t1 -> t2 -> t3       | load
[]= : t1 -> t2 -> t3 -> t3 | store
<   : t -> t -> bool       | lt
>   : t -> t -> bool       | gt
<=  : t -> t -> bool       | le
>=  : t -> t -> bool       | ge
==  : t -> t -> bool       | eq
||  : t -> int             | size
---------------------------------------

UPDATE The return type of []= will be unit if assignments are statements. If they are not statements, we should add something (at some point) to protect against misspelled comparisons that accidentally turn into assignments.

TobiasWrigstad commented 7 years ago

Update, the [] and []= forms should be variadic, i.e., we should support

foo[x, y] = z

etc.

kaeluka commented 7 years ago
TobiasWrigstad commented 7 years ago

Thanks for the feedback! Re your second point, that's what I first thought too. But it seems strange that loops etc. have a return value and stores don't.

With respect to your second point, I completely agree. Here is what I wrote about this in the Slack channel (now obscured by subsequent messages) yesterday:

One interesting aspect of changing the syntax of type parameters to square brackets instead of angle brackets is that foo[i](j) can now, from the parser's perspective, be both a function call to a function named foo with type parameter i and argument j, and an function call on some unnamed closure in element i in the array foo with argument j.

Once we make this change, we will have to use types to disambiguate this, or find alternatives. Changing the syntax of passing type parameters to a function, for example. But I don't know what a good alternative is.

kaeluka commented 7 years ago

An alternative would be to use call syntax: x(0). It makes an array look like a function that accepts an index.

The reason for [] is of course that it's more popular. But if it leads to ambiguities...

kaeluka commented 7 years ago

Re your second point, that's what I first thought too. But it seems strange that loops etc. have a return value and stores don't.

I don't mean to sound stubborn: but it really doesn't seem strange to me at all. A loop produces something (including side effects), an assignment consumes something.

TobiasWrigstad commented 7 years ago

The reason for [] is of course that it's more popular. But if it leads to ambiguities...

It does not lead to any real ambiguities. However, parsing will become more convoluted.

Since types start with a Capital letter, and variables/methods won't, it will be pretty easy to see when the operand of [] is a type or not. The exception is type variables, but now we are talking very rare code IMO. Also, it is very likely that type parameters will not be needed in a majority of cases because of type inference.

For me, the reason to go with an expression-based syntax is that it is enabling. Is it so bad to let assignments return a value? (I don't feel very strongly about this, but I'm curious.)

kaeluka commented 7 years ago

The 'canonical' justification for making assignments void is that: if (x = true) then ... else ... end should not typecheck.

kaeluka commented 7 years ago

This typo is much more common in languages with weird semantics of booleans (like C), but the fact that it would be extra unlikely to happen in encore would make it extra hard to spot (you're off guard).

TobiasWrigstad commented 7 years ago

That hole seems easily pluggable by not allowing assignments in guards. (At least not in strict mode.) It would be better IMO to reject the program because of the probability that this is not what the programmer means, rather than a type error.

Also, if == is is...

kaeluka commented 7 years ago

Also, if == is is...

That'd make it better. I still think it's strange that assignments return things -- but that's a weak argument.

Does that mean that the table above (which contains ==) will change again? In a separate issue, I suppose?

TobiasWrigstad commented 7 years ago

I don't know if it will change. That's something I want to discuss soon (maybe this Thursday). And yes, separate issue.

On one hand, I like keywords over ASCII symbols. And it is nice if the eq operator desugars to a .eq() call. On the other hand, if x == y then ... feels more readable than if x eq y then ... because of how symbols are visually distinct from characters. Now that we've abandoned { and }, this becomes more important IMO.

kaeluka commented 7 years ago

I generally agree that keywords are nice. But liking keywords does not mean that we have to use them everywhere, always. I'll say that again whenever we discuss it for real.

TobiasWrigstad commented 7 years ago

+1 the question is just where (not)

supercooldave commented 7 years ago

The 'canonical' justification for making assignments void is that: if (x = true) then ... else ... end should not typecheck.

I'm not sure whether this is an argument for making assignments expressions (with void or unit type) or against it. Assignments won't have boolean type.

supercooldave commented 7 years ago

BTW: is there a difference between void type and unit type in Encore (or future Encore)?

TobiasWrigstad commented 7 years ago

I'm not sure whether this is an argument for making assignments expressions

I don't think Stephan cares -- his main point is that stuff like that should not work. Originally, we were discussing the convenience syntax x[y] = z which desugars to x.store(y, z). The latter is clearly an expression, and the question is whether is should return unit or something else (like return z).

Assignments won't have boolean type.

We are discussing the pros and cons of making assignments expressions, especially in the context of them being desugared into different syntactic forms. If, as I proposed above, x[y] = z is desugared into the method call x.store(y, z), then (if we allow it) the type may well be boolean (or a Maybe[t], which'd suffer the same problem). It won't be a statement.

is there a difference between void and unit

I don't think so, modulo syntax ofc. I think you wanted unit because of Haskell. For me it has been a search and replace.

I suggest we move the whether assignments are expressions or statements to a different thread because this really orthogonal to the RFC above. I am updating it with a note saying that []= will have a type signature that makes x[y] = z work like an assignment (i.e., if assignments are statements this should return unit, etc.).

supercooldave commented 7 years ago

Regarding the clash between [x] for type parameters and for array access, is it possible that a type T[x] and a term e[x] can appear in the same place, syntactically?

TobiasWrigstad commented 7 years ago

I don't think so. But T[x](e) and e[x](e) could (as already pointed out).

supercooldave commented 7 years ago

What does expression T[x](e) correspond to, where T[x](e) is a type?

TobiasWrigstad commented 7 years ago

Since we are dropping new, List[T]() is creating a list of Ts, and list[t]() is getting the teeth element of list, which is a closure, and calling it.

supercooldave commented 7 years ago

And the issue is that this syntactic ambiguity cannot be resolved until type checking, right? And to avoid unnecessarily complicating type checking, we'd rather not do such resolution in type checker?

TobiasWrigstad commented 7 years ago

And the issue is that this syntactic ambiguity cannot be resolved until type checking, right?

Yes.

And to avoid unnecessarily complicating type checking, we'd rather not do such resolution in type checker?

I honestly don't mind that part. It makes things a tad more complicated, but not much.

supercooldave commented 7 years ago

There is a little of that sort of type-based resolution in the compiler already (at least, in the modules branch). It is relatively simple to do so long as you can locally work out the types of the various components (sub-expressions) involved. Type inference might make that more complicated in some cases (though probably not here), because one doesn't locally know the solutions to the typing constraints.

TobiasWrigstad commented 7 years ago

That sounds great!

supercooldave commented 7 years ago

Will these operators be bundled together in one or more traits? These traits are likely to need F-bounded polymorphism to be useful. Will we support that?

TobiasWrigstad commented 7 years ago

The operators are like the methods for the for comprehensions. If store() is present in class C (by way of one of its traits, no matter which one) -- then you can write let c = C() in c[42] = bar. No requirement to implement all operators or specific subsets.

Unless someone has a better idea.

supercooldave commented 7 years ago

The same one I had for Stephan's comprehensions, namely, to use traits.

Without traits, you will need some kind of ad hoc structural polymorphism, or no polymorphism.

kaeluka commented 7 years ago

If we end up having polymorphism that's powerful enough, such traits might make sense here. But I doubt that we'll have that polymorphism. What Tobias proposed seems like a good tradeoff between effort and convenience. This is more like operator overloading.

kaeluka commented 7 years ago

For one, and AFAIK, we don't plan to have traits that have access to a Self type. This type would be needed here.

supercooldave commented 7 years ago

Or Traits with F-bounded polymorphism.

kaeluka commented 7 years ago

I don't see what the signature of, for instance, eq would be without referencing the type of this (the Self type).

supercooldave commented 7 years ago

Excuse my syntax.

trait Eq[t]
  require eq(that : t) : bool

class Dog : Eq[Dog]
  def eq(that : Dog) : bool
    this == that
kaeluka commented 7 years ago

And then I would be free to implement this? class Cat : Eq[Dog] { ... }

supercooldave commented 7 years ago

And then I would be free to implement this? class Cat : Eq[Dog] { ... }

Yes, I guess.

But you would not be able to use it where eq is expected.

For example, if I have a collection that requires the eq method, then its header would look something like this: class Collection[X <: Eq[X]]

Collection[Cat] would not be a valid type.

kaeluka commented 7 years ago

It's not a good design to have an Eq trait that allows for cats implementing equality to dogs in the first place. It makes no sense. Equality is an equivalence relation, and equivalence relations need to be reflexive. forall x: x.eq(x). Any type implementing equality to any other type is therefore bogus.

This has implications, for instance in the collection, and everywhere else, programmers are then required to keep restating the obvious: [X <: Eq[X]] is redundant, as X <: Eq[AnythingOtherThanX] makes no sense.

supercooldave commented 7 years ago

This is how it is done using F-bounded polymorphism.

What is your solution? In what formal system?

kaeluka commented 7 years ago

Option 1: No trait at all, just implement operator overloading that doesn't care about types (as this RFC suggests in my understanding)

Option 2: We could have traits that make the Self type available:

trait Eq
   require eq(other: Self): bool

It would make sense, but my understanding was that we will not add this feature. I also think that there are good reasons not to add it.

kaeluka commented 7 years ago

I think that you won't like option 1 very much. But consider that if we spend much time on the traits system, that would mean that other features would suffer.

If you look at Java's equals methods: they have a really imprecise interface (it takes Object), and people love to hate it -- but it does not cause real problems in real code (I love to hate it as well, btw).

An option 1 1/2 would be to emit a warning if a method called eq does take a parameter of the wrong type.

supercooldave commented 7 years ago

I agree that the Self option seems best, though I don't really know what the consequences of introducing it would be (for subtying etc).

Yet it won't be able to express reflexivity of equality, as was one of your criticisms of the F-bounded polymorphism approach. The constraint X <: Eq[X] would be repeated in classes that require equality to behave like equality, but it is not redundant, because it is not expressed elsewhere in the F-bounded polymorphism approach. It is simply formulated within the system in some way that cannot be formulated in Java, for instance. It follows a (not design) pattern, but is not ideal, and one should try to do better.

kaeluka commented 7 years ago

The RFC, as stated, can work without any typing support at all. If we agree that the current type system's abilities are too limited for doing it right, I'd suggest not doing it at all for now. This keeps the names available for a proper implementation in the future. It also keeps the proposed features of the RFC 'bite-sized'. IMO, a half-baked trait in the standard lib is worse than not having one at all in the standard lib.

supercooldave commented 7 years ago

Firstly, it seems that it won't work without typing support due to the syntax conflict. Secondly, one of the points of these discussions is to avoid any half-baked traits.

supercooldave commented 7 years ago

Scala supports F-bounded polymorphism. It's type system is pretty advanced.