Open chochos opened 12 years ago
Nice idea! I have also noticed in the past that something like that could be useful, and we recently had problems in that direction:
https://github.com/ceylon/ceylon.language/commit/e6210303ed21b264416c2e2d2ca152e8c023e558
Snazzy! I think this in combination with the restriction I mentioned in that thread would be awesome.
There's actually a second bit of this proposal that @chochos forgot to write down. The idea has two parts: given subclauses to declare locals, and expressions between subclauses. The second bit would let you write:
given (i:=0) for (x in xs) i++ if (exists x) i->x
Which fixes what tripped us up in the definition of Iterable.indexed.
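As a rough illustration only, here is a Python sketch of what the given/for/if snippet above would compute if read literally: the counter is bumped for every element, and only non-null elements produce an entry. The name indexed_entries and the use of None for Ceylon's null are assumptions made for this sketch. Read this way, the first element gets index 1, because i++ runs before the entry is built.

```python
def indexed_entries(xs):
    """Model of: given (i:=0) for (x in xs) i++ if (exists x) i->x"""
    i = 0                  # given (i:=0): a local owned by the comprehension
    for x in xs:           # for (x in xs)
        i += 1             # i++ : expression evaluated between subclauses
        if x is not None:  # if (exists x)
            yield (i, x)   # i->x


print(list(indexed_entries(["a", None, "b"])))
```

The point of the sketch is that the counter advances even for skipped elements, which a plain filter-then-index pipeline would not do.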
Right, forgot about expression between subclauses. Well that's the whole idea right there.
First, I'd rather use the consecrated keyword for this and allow multiple vars to be defined:
let (i:=0;j=foo()) for (x in xs) something(x,i++)
With a semantics like letrec, where each binding can see the other, allowing you to define things like:
let (a:=() b();b=() a) for (x in xs) something(x,i++)
And I'd even make that available outside of comprehensions as an expression, and what the hell allow statements in it too:
value foo = let (a = f(); b = g()) { a * a + b * 1/b }
Mmm, that wouldn't be too useful if we could define inline statements like that:
value foo = { a = f(); b = g(); return a * a + b * 1/b; }
Which would be equivalent to:
// define a lambda that contains statements, which we call immediately
value foo = (function (){ a = f(); b = g(); return a * a + b * 1/b; })();
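The "lambda that we call immediately" reading can be sketched in Python as well. This is only a model under stated assumptions: f and g are hypothetical stand-ins returning fixed numbers, and the bindings become lambda parameters, so unlike the letrec-style semantics suggested earlier they cannot see each other.

```python
def f():
    return 3  # hypothetical stand-in for f()


def g():
    return 4  # hypothetical stand-in for g()


# let (a = f(); b = g()) a * a + b * 1/b, modelled as an immediately applied lambda
foo = (lambda a, b: a * a + b * 1 / b)(f(), g())
print(foo)
```

Both a and b are evaluated exactly once and are visible only inside the body, which is the scoping a let expression would give.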
But that brings me to something that's been bothering me about comprehensions for a while: for and if have completely different semantics depending on whether or not they are used as a statement or in a comprehension. I wouldn't see any problem if for and if could be used inside expressions, as their semantics would be very similar. But for an expression if we use the combination of then and else, which is already a bit irregular (test() then foo() else bar() could be written as if test() foo() else bar(), or a variation with punctuation to make it more regular with the statement equivalent).
Now, I love comprehensions but I find it disconcerting and irregular that within them, the meanings of for and if are mapped not to their respective statements but to map and filter. I don't really have a better idea though, but I thought it'd be something worth noting, as regularity is something we care about.
Also, within a comprehension the procedural rules stop applying: no need for brackets or semicolons anymore, what look like statements are expressions that return values, and the absence of else for an if has an entirely different meaning, as we're suddenly building lists and mimicking the behaviour of map/filter (which again, is pretty cool). I have the feeling that comprehensions are a different language altogether, with its own syntax. Adding new stuff to it, like local bindings, makes it even more complex and even more alien to the rest of the language.
Don't take this as just me randomly bashing something: I love comprehensions and I think we should keep them and possibly extend them with local bindings, but I feel the syntax is too alien to the rest of Ceylon and I am afraid we're introducing a language-within-a-language syntax.
That statement in particular looks entirely like a different language:
given (i:=0) for (x in xs) i++ if (exists x) i->x
Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?
Rewriting this statement with a more Ceylon-like syntax:
value foo = { variable i:=0; for (x in xs){ i++; if (exists x) i->x; }}
Which would even be valid when used as a statement:
variable i:=0;
for (x in xs){
i++;
if (exists x) // well, if we allow single-statement ifs to drop braces ;)
i->x;
}
It's just an idea, but I have a feeling that would be more powerful, intuitive and regular.
What's the return type of an expression block? Its last statement's return type, or that of its 'return' statements. The return type of a for or while expression is again that of its last statement. The return type of an if is the union of the last statements of its then/else blocks.
Wouldn't something like that work?
I'm fine with let here. Don't love the defining-multiple-variables-in-one-let bit just because we don't let you do that in other places.
I also agree that if we introduce this feature in comprehensions, then we'll also need to explore where else in the language it would be useful. For example, regularity suggests the following:
let (x = something()) {
//x is defined just in this local scope
}
Now, I love comprehensions but I find it disconcerting and irregular that within them, the meanings of for and if are mapped not to their respective statements but to map and filter.
FTR, that's not actually how it works under the covers. The idea of mapping to map() and filter() had problems because we allow products.
Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?
I spent a lot of time thinking about this, on several occasions, and just never came up with anything satisfying:
for expression could return, depending upon what you want to use it for (which is the reason why comprehensions are function arguments)
what the fuck should while evaluate to?!
if has a reasonable interpretation in an expression context, but if (foo) bar else baz is just visually horrible and so asymmetric, especially for something I use all the time.
switch is a very reasonable thing to support, but once you start having 3-branch conditionals inside an expression, the code can pretty quickly become difficult to read.
try works, but how often do you want to do exception handling inside an expression?
for and if have completely different semantics depending on whether or not they are used as a statement or in a comprehension.
I think of them as actually very similar. For example, if means "if this condition holds (at run time), then include this statement/expression here (at run time)". They get implemented slightly differently and they have slightly different syntax, but to me they are very much akin. (@gavinking, note that this interpretation gives a semantics for while in comprehensions.)
As for the curly braces, the sole purpose of curly braces is to disambiguate nesting. With comprehensions we have decided to optimize for what we expect to be the common case: everything nests. That is, everything after a for or an if is inside the for/if. This lets us, and implies we should, get rid of curly braces since there is no ambiguity. Of course there is a cost to this: we can't express comprehensions that require more complex nesting behavior. In particular, this includes things like else and switch.
I do find it weird that we get rid of semicolon after nested statements though. In fact, I'm a little worried that'll prevent us from using any syntax that relies on the presence of a semicolon in normal statements to disambiguate.
I agree with @RossTate—I don't think it's fair to say they have different semantics at all.
Hey, if we allow
let (x = something()) {
//x is defined just in this local scope
}
why not go all Haskell-y and allow
{
//x is defined just in this local scope
} given (x = something())
as well? :)
So the tangential discussion on this thread fits very much with the tangential discussion on #363. Ceylon today, like most mainstream languages, is "statement-oriented" and optimized for methods with multiple statements. Now I personally just love to write methods/getters with just single-expressions wherever that is reasonable, and for that reason I kinda like languages which are more "expression-oriented" in their syntax. Which is why we have stuff like comprehensions and then/else, I guess.
Up until now, I have not been that keen on introducing something like:
function name() => person.name.first + " " + person.name.last;
because once you try to refactor out common sub-expressions, you need a big change to the syntax (sure, the IDE can do this for you, but still). So it's not a syntax that "scales". It's an abbreviation that winds up getting in the way when you start maintaining the code.
On the other hand, let in expressions didn't make much sense either, since you could just refactor out the common subexpression to a statement:
function name() {
Name n = person.name;
return n.first + " " + n.last;
}
However, if we have both these features, then we have something that makes sense. It becomes possible to write the above in the following "expression-oriented" form:
function name() => let (n = person.name) n.first + " " + n.last;
So then we need to answer the question: is this an improvement over the "statement-oriented" form, and is having two ways to write the same thing a good or a bad thing in this case. I'm interested to know what you guys think.
I personally find the definition with => and let rather nice to look at. But it's hard to argue that it's objectively better when it involves really about the same number of tokens. Still, even if counting tokens can't explain my like for it, perhaps there's some other explanation...
@quintesse FTR, I actually prefer the keyword given in both locations, for how it reads in English. I'm not especially keen on the postfix given because it's an irregular syntax compared to everything else we have in the language.
Sure, I wasn't actually being serious, especially because in Haskell it's used to even define localized functions where it makes more sense to put them after the "important" work.
PS I'm not sure how I feel about the short-cut syntax yet, but it sure makes me want the other Haskell syntax of being able to do:
Integer fib(0) => 0;
Integer fib(1) => 1;
Integer fib(n) => fib(n-1) + fib(n-2);
hehe
what the fuck should while evaluate to?!
An iterator.
if has a reasonable interpretation in an expression context, but if (foo) bar else baz is just visually horrible.
If punctuation for expressions is optional (like for curly braces), then surely for single braces too: if foo else bar
I don't think it's fair to say they have different semantics at all.
Perhaps you're right, but it's still a language-within a language. Again, perhaps that's not an issue at all, but I wonder if it really isn't, and if we can't do something to unify both.
I think the syntax you are pointing to depends on your background. For me it's C, so even in languages like Scheme or JavaScript where both alternatives are possible I find myself writing:
function name(){
var n = person.name;
return n.first + " " + n.last;
}
Rather than:
var name = function(){
var n = person.name;
return n.first + " " + n.last;
}
Same in Scheme with:
(define (name)
(let ((n person.name))
(append n.first " " n.last)))
rather than:
(define name (lambda ()
(let ((n person.name))
(append n.first " " n.last))))
Personally I really have trouble making any sense of function name() => person.name.first + " " + person.name.last; but just turning it into function name() { return person.name.first + " " + person.name.last;} makes it readable for me. But I really think it's a background thing, more about habits than quantifiable things.
If punctuation for expressions is optional (like for curly braces), then surely for single braces too: if foo else bar
Eh? I can't even hazard a guess what if foo else bar might mean... I assume that's a typo?
You're going to need some kind of punctuation to separate the condition from the first expression. So you have a choice between if (foo) bar else baz, which is just awfully asymmetric, and if foo then bar else baz, which is already different to the traditional C syntax, or you could go back to something more like C's ternary operator syntax, foo ? bar:baz or if (foo) bar:baz. Personally, I think then and else are superior to any of these options.
Eh? I can't even hazard a guess what if foo else bar might mean... I assume that's a typo?
Hah, yes of course, sorry: if foo bar else gee
@FroMage Again FTR, I have a rather extreme aversion to:
value sqr = (Float x) x*y;
To the extent that I think it's almost unfortunate that Ceylon lets you write this. The fact that this even works is more of an unintended consequence—the intersection of two language features that aren't really designed for use together—than an intentional feature. I suppose that everyone here would agree that what we actually want people to write in this case is just:
function sqr(Float x) { return x*y; }
So, indeed, this might be the strongest argument yet for supporting the form:
function sqr(Float x) => x*y;
i.e. if we don't support that Dart/Coffeescript-style arrow syntax, then some people are going to want to use the "value = anon function" form like you quite often see them do in the Scala community. (Apparently in the world of Scala, this idiom is not frowned upon.)
And function f() => e; would just be a parser alias for function f() { return e; }?
@FroMage Right. I guess it's something I could implement in the parser in like 5-10 mins, though I might not do it exactly like that. Supporting let would be a little more work, and would impact the backend, but it's still pretty straightforward, I suppose. (Though I have not put a huge amount of thought into it.)
So then we need to answer the question: is this an improvement over the "statement-oriented" form, and is having two ways to write the same thing a good or a bad thing in this case. I'm interested to know what you guys think.
My opinion: it looks kind of nice, but not so much nicer than the current syntax. And having two ways to write the same thing is a bad thing in this case.
An important goal of Ceylon, according to the home page, is readability. If I interpret this goal correctly then it should lead us to a syntax that is as homogeneous, regular and easy to learn as possible. I think if we take that goal seriously then we can only introduce a new construct into the language if it solves a real, significant new problem. And I just don't see a sufficient benefit in this case.
i.e. if we don't support that Dart/Coffeescript-style arrow syntax, then some people are going to want to use the "value = anon function" form like you quite often see them do in the Scala community. (Apparently in the world of Scala, this idiom is not frowned upon.)
So we have two ways of expressing the same thing but one of them is not so nice. So we introduce a third possibility to lure people away from the one we want to hide ;-) Ok, I understand the reasoning and I also don't really like the current syntactic duality of functions. It would be great if we could find a way to improve that. But I don't think that this is the right way.
@ikasiuk I agree with all your points, but note that there are a couple of different factors pushing in this direction:
function sqr(Float x) { return x*x; } is a little heavy on the eyes compared to some other recent languages like CoffeeScript and Dart, and
value sqr = (Float x) x*x;
I don't think that 2+3 on their own would be enough to motivate me to want to make this change. But if other considerations are "pushing" in the same direction (that is, if it just feels like the language naturally wants to grow in that direction), well, then that's a different matter...
But I don't think that this is the right way.
Can you think of a different syntax for partial application of a function / generic type? If we came up with something natural, then that might be a different approach that would give many of the same advantages. The trouble is, when I think about that problem, I start coming up with stuff like:
function sqr(Float x) = Integer.power(*)(2);
which I think is just not at all ceylonic (it's actually just Scala's horrible underscore in disguise), and is actually much less flexible than the fat arrow.
I start coming up with stuff like:
function sqr(Float x) = Integer.power(*)(2);
which I think is just not at all ceylonic.
Note that according to the language spec as it exists today, I should be able to write a general-purpose function named shuffle() that would let me define sqr() like this:
function sqr(Float x) = shuffle(Integer.power)(2);
i.e. shuffle() swaps the first and second parameter lists of the higher-order function Integer.power, letting you apply the second argument list (the method arguments) before the first (the method receiver).
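The parameter-list swap is easy to model outside Ceylon. The Python sketch below is an illustration, not the Ceylon library function: power is a hypothetical curried stand-in for Integer.power (receiver first, then the exponent), and shuffle is written here just to show the swap.

```python
def power(receiver):
    """Curried stand-in for Integer.power: receiver first, then the exponent."""
    return lambda exponent: receiver ** exponent


def shuffle(f):
    """Swap the first and second parameter lists of a curried function."""
    return lambda *second: lambda *first: f(*first)(*second)


sqr = shuffle(power)(2)  # supply the method argument (2) before the receiver
print(sqr(3))
```

Here sqr(3) ends up evaluating power(3)(2), i.e. the receiver is supplied last even though it is the first parameter list of the original function.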
Now, that's all very cool and powerful but I think it's exactly the kind of thing we don't want people to be tempted to use in "everyday" code.
Hah, yes of course, sorry:
if foo bar else gee
This can't be parsed, consider stuff like:
if x + y + z ...
if x { y } ...
We can't tell where the condition ends, at least not with finite lookahead in a CFG.
Two remarks: I prefer given over let, and I don't share the aversion against value f=(Float x)x*y; (although I personally prefer the alternative).
And one question (I probably just missed that): Why do we need to introduce a new symbol => instead of just using =?
And one question (I probably just missed that): Why do we need to introduce a new symbol => instead of just using =?
The explanation is in #363, which is actually where most of the explanation for the => is.
Note that let is the keyword used in JavaScript as well, so it's well-known.
Note that let is the keyword used in JavaScript as well, so it's well-known.
let in JavaScript is apparently not widely supported, so "well-known" might be exaggerated. And in general it seems that in Ceylon we tend to prefer the keyword that fits best, rather than what is most popular in other languages.
The explanation is in #363, which is actually where most of the explanation for the => is.
Ah ok, thanks. I agree that => makes sense for type aliases because they are too different from simple assignment. So => would basically mean "alias". But I'm still not sure about using it for functions. Assuming that we introduce => for types and functions, how exactly would it work?
class B<X>(X x)=>A<X,Y>(x,2); ?
void functions, right? (void f()=>g(3);)
Float x => retrieveX()
assign x => storeX(x);
function f(...)=g(...) form, or are there cases where that would also still be allowed?
No need to state value or variable since it can be inferred from the declaration: = means value, := means variable.
That reminds me that the variable and := in something like
variable Integer i := 0;
always feel somewhat redundant. Could we say that an attribute that is initialized with := is always automatically variable even if that annotation is omitted? Maybe we could use the same rule as for type inference and allow that only for non-shared attributes.
I talked about that with @gavinking once. I think the idea is that you must be sure that you want mutability; that's why you have to type that long keyword even if it seems redundant. That's the part about encouraging immutability - if we only leave := then it's not so clear that you are declaring a variable and not an immutable value.
I talked about that with @gavinking once. I think the idea is that you must be sure that you want mutability; that's why you have to type that long keyword even if it seems redundant. That's the part about encouraging immutability - if we only leave := then it's not so clear that you are declaring a variable and not an immutable value.
I agree for shared attributes: the interface of a type shouldn't be defined by such implicit mechanisms. But I don't see a problem with non-shared attributes. It's rather unlikely that you erroneously write := instead of = and then actually produce a situation where that causes any harm. After all, we also allow type inference although we can't check if the inferred type is really the one intended by the user.
Now I know what felt wrong about the proposed => for functions: it's inconsistent unless we make the same change for anonymous functions. If we write a function that consists of a single expression as
Float sqr(Float x) => x**2;
then we are more or less forced to write anonymous functions in the same way:
values.map((Float x) => x**2);
That actually looks surprisingly readable.
Mmm, that wouldn't be too useful if we could define inline statements like that:
value foo = { a = f(); b = g(); return a * a + b * 1/b; }
There are a couple of problems with that. Firstly we would probably have to use [/] for sequence literals instead of {/}, to avoid ambiguity (that's not necessarily a problem of course).
Using return could also be rather confusing if the statement appears in a function. In Scala a block of statements simply represents the value resulting from the last statement in the block. I'm not a big fan of that because it often doesn't read very well. Another possibility would be a "secondary" return statement (e.g. yield).
But to be honest I don't think this kind of block expression is desirable at all because it makes the code structure less homogeneous and reduces readability.
Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?
I think that wouldn't only be very difficult but also wouldn't fit into the language very well. But that doesn't mean that a slightly more generalized mechanism for comprehensions is impossible:
The first problem is that the concept of a comprehension isn't represented optimally in the type hierarchy. A comprehension produces an Iterable. But when passing a comprehension to a sequenced parameter it behaves differently than when passing an Iterable directly: a comprehension is basically an Iterable plus "...". That's not a big problem as long as comprehensions can only be used for sequenced parameters anyway. But if we want to make them more general-purpose expressions then that's something we have to address.
Possible solution: a comprehension represents an object of type Comprehension, and Comprehension is a simple marker interface derived from Iterable. If the static type of a value passed to a sequenced parameter satisfies Comprehension then it is automatically used as an argument sequence, otherwise it's used as a single argument unless "..." is applied.
The generalization of the comprehension mechanism could be achieved by introducing three new operators:
for (...) expr: represents a Comprehension which evaluates the expression for each input element, skipping resulting null values.
if (...) expr: replaces the then operator, i.e. evaluates to null if the condition is false.
let (...) expr
Of course these operators would not be allowed as the beginning of an expression statement, to avoid confusion with the for and if statements. If the operator precedence is chosen accordingly (for/let < else < if) then they can completely replace the current comprehension mechanism:
value comp = let (i := 0) for (e in elems) if (exists e) ++i->e;
Unfortunately this doesn't allow expressions between subclauses, so people might end up abusing the let operator for that purpose: let (dummy = ++i). A simple solution may be to allow the following:
value comp = let (i := 0) for (e in elems) let (++i) if (exists e) i->e;
This obviates the need for the elements function, as seen above (because a Comprehension is an Iterable). And of course we can also still write:
value seq = { let (i := 0) for (e in elems) let (++i) if (exists e) i->e; };
Another big advantage is that we can use the individual operators independently in expressions and in particular with the proposed => syntax for functions.
And we can directly use comprehensions also for parameters of type Iterable, not just for sequenced parameters.
While there are aspects of your proposal I like, making Comprehension an actual type has some issues, primarily with ambiguity. For example, consider the following code (assuming printLnEach(Object...) is defined):
value seq = for (i in 1..3) i * i;
printLnEach(seq);
If we infer type Iterable<Integer> for seq, then seq is passed as a single argument and this'll print {1,4,9}\n. If instead we infer type Comprehension<Integer> for seq, then it is spread into an argument sequence and this'll print 1\n4\n9\n. So now type inference affects semantics. In general, this issue has to do with behavioral subtyping, a property we'd like but which is incompatible with Comprehension as a specially handled type.
@RossTate: Well, strictly speaking it's not ambiguous because obviously the inferred type would always be the same (and well-defined). But I do see that making the behavior depend on the static type of the expression could cause problems.
But it is clear that expression-based comprehensions would require us to solve that problem somehow. I see two alternatives to a type-based solution:
for (...) expr is just a simple Iterable and that's it. This means you would always have to apply "..." to use it as a sequenced argument.
for (...) expr is an Iterable but using it for a sequenced parameter is treated as a special case so that "..." can be omitted.
Your second proposal is still ambiguous. printLnEach(for (i in 1..3) i * i) could be interpreted as a single (iterable) argument or as three (integer) arguments.
Your first proposal doesn't compose properly. For example, for (i in 1..3) for (j in 1..3) i*10 + j could be an iterable of iterables of integers (since for (j in 1..3) i*10 + j is an iterable) or an iterable of integers.
Your second proposal is still ambiguous. printLnEach(for (i in 1..3) i * i) could be interpreted as a single (iterable) argument or as three (integer) arguments.
The idea of the second solution is that this would always be treated as a special case (i.e. three integers in this case).
Your first proposal doesn't compose properly. For example, for (i in 1..3) for (j in 1..3) i*10 + j could be an iterable of iterables of integers (since for (j in 1..3) i*10 + j is an iterable) or an iterable of integers.
Not sure what you mean, how could this result in an iterable of integers?
The interpretation depends on the precedence and associativity of the operators, and of course you can use parentheses if necessary. In this case for (i in 1..3) for (j in 1..3) i*10 + j = for (i in 1..3) (for (j in 1..3) (i*10+j)).
The idea of the second solution is that this would always be treated as a special case (i.e. three integers in this case).
Suppose printLn(Object) already existed, then the implementer decided to extend it to handle multiple args, changing the signature to printLn(Object...). This special casing approach will then result in a change to semantics, though one I wouldn't expect anyone to anticipate.
Not sure what you mean, how could this result in an iterable of integers?
My understanding of the current proposals is that for (i in 1..3) for (j in 1..3) i*10 + j is supposed to be a bunch of integers, in this case corresponding to 11, 12, 13, 21, 22, 23, 31, 32, 33.
@RossTate you're right, that's how comprehensions are implemented. for (i in 1..3) for (j in 1..3) i*10 + j results in 11, 12, 13, 21, 22, 23, 31, 32, 33.
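The two readings being contrasted here can be shown side by side with Python list comprehensions, which happen to support both shapes. This is only an analogy to make the difference visible, not a statement about how Ceylon implements comprehensions.

```python
# Flat reading: both for clauses feed one result sequence.
flat = [i * 10 + j for i in range(1, 4) for j in range(1, 4)]

# Nested reading: the inner for is parenthesised into its own iterable.
nested = [[i * 10 + j for j in range(1, 4)] for i in range(1, 4)]

print(flat)
print(nested)
```

The flat form yields nine integers, while the nested form yields three iterables of three integers each.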
Ok, those are valid points. But I'm not giving up yet :-) So here's a different approach:
Instead of a Comprehension interface derived from Iterable, introduce a class Sequenced encapsulating an Iterable:
shared class Sequenced<out Element>(iterable) extends Object() {
shared Iterable<Element> iterable;
shared actual Boolean equals(Object other) {
if (is Sequenced<Element> other) { return iterable==other.iterable; }
return false;
}
shared actual Integer hash { return iterable.hash; }
shared actual String string { return iterable.string + "..."; }
}
The for operator returns a Sequenced. And x... is defined as Sequenced(x) for any Iterable x.
If an object obj is passed to a sequenced parameter param and obj is Sequenced then param=obj.iterable (i.e. sequenced argument) else param={obj} (i.e. single argument). In most cases this can be decided at compile time, but not always! In general, the argument is equivalent to what would be returned by the following function:
Iterable<Element> iterable<Element>(Sequenced<Element>|Element obj) {
if (is Sequenced<Element> obj) { return obj.iterable; }
if (is Element obj) { return {obj}; }
throw;
}
It may seem strange at first that it is sometimes a runtime decision whether the argument is a sequenced argument or not. But it actually makes sense: in printLn(obj) the object obj will only be passed as a sequenced argument if it is Sequenced, and Iterable and Sequenced are separate types. So an Iterable will not be unintentionally used as a sequenced argument.
The case mentioned by @RossTate where the parameter type is changed from Object to Object... can still occur - but not with an Iterable, only with Sequenced. I guess in this case the changing behavior is somewhat more understandable.
Examples:
printLn(for (i in 1..3) i*i); // sequenced
value iter1 = (for (i in 1..3) i*i).iterable;
value iter2 = elements { for (i in 1..3) i*i }; // equivalent to iter1
value seq = { for (i in 1..3) i*i }; // works as usual
printLn(iter1); // NOT sequenced
printLn(iter1...); // equivalent to printLn(Sequenced(iter1))
This allows an important change to the for operator: the right-hand side is treated like a sequenced parameter. So if the RHS is Sequenced then it is treated as a list of values, otherwise as a single value. All values from all iterations of the for are aggregated into a single sequence. Sounds complicated but is actually fairly straightforward:
printLn(for (i in 1..2) for (j in 1..2) i*10+j); // 11, 12, 21, 22
printLn(for (i in 1..2) { for (j in 1..2) i*10+j }); // { 11, 12 }, { 21, 22 }
value iter1 = elements { for (i in 1..2) for (j in 1..2) i*10+j }; // Iterable of Integers
value iter2 = elements { for (i in 1..2)
elements { for (j in 1..2) i*10+j } }; // Iterable of Iterables
printLn(for (i in 1..10) (myIterable(i)...)); // aggregate Iterables into a single Iterable
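The aggregation rule for the for operator can be modelled in a few lines of Python. This is a sketch of the proposal as described, with hypothetical names: for_op stands for the for operator, and a result is spliced in only when it is wrapped in Sequenced.

```python
class Sequenced:
    """Minimal stand-in for the proposed Sequenced wrapper."""

    def __init__(self, values):
        self.values = list(values)


def for_op(source, body):
    """Model of the proposed for operator: splice Sequenced results, append others."""
    out = []
    for item in source:
        result = body(item)
        if isinstance(result, Sequenced):
            out.extend(result.values)  # RHS is Sequenced: contribute each value
        else:
            out.append(result)         # RHS is a single value
    return Sequenced(out)


# for (i in 1..2) for (j in 1..2) i*10+j
flat = for_op([1, 2], lambda i: for_op([1, 2], lambda j: i * 10 + j))

# for (i in 1..2) { for (j in 1..2) i*10+j }  (inner result unwrapped into a plain list)
nested = for_op([1, 2], lambda i: for_op([1, 2], lambda j: i * 10 + j).values)

print(flat.values)
print(nested.values)
```

The only difference between the two calls is whether the inner result is still wrapped when the outer for sees it, which is exactly what the braces control in the examples above.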
There are still issues. For example, now printLn(obj) is different from printLn(obj.string). What happens when we have for(obj in objs) obj and one of the objs is a Sequenced?
I have an idea combining your thoughts here with my idea for handling null cleanly in ceylon/ceylon.language#136 (which no one has gotten back to me on). Wait a sec and I'll explain what I have in mind.
Okay, so in ceylon/ceylon.language#136, I break things up into class types and case types. Now I'm gonna extend that idea with, say, multi types. Eventually we should combine multi and case types to get something very powerful, but for now let's just play basic multi types.
To begin, Iterable is a class type, whereas your ideas of Comprehension or Sequenced would be multi types. By keeping them distinct spaces, we avoid the ambiguity issues I was raising. In particular, Comprehension and Sequenced are not subtypes of Object.
For now, a multi type M has the following grammar:
M ::= Nothing | C, M | M|M | C*
This is a lot like a regular expression with class types C as letters, except with restrictions for the sake of keeping things unambiguous, though here I've made them much more restrictive than necessary. Note that we can express null: C? is just the multi type (C,Nothing)|Nothing. We can also express tuples: A,B,Nothing.
Now, an Iterable is a class type, and so for comprehensions essentially convert iterables to multi types. if comprehensions also result in multi types. null has multi type Nothing.
On the other hand, {-} turns a multi type into a class type. Now, the specific class type this results in can depend on the specific multi type being captured. For example, if the multi type is (C,Nothing)|Nothing then {} results in Maybe<C>. So, if I do {array[5]} I get a Maybe<C>. If I do {for (i in 0..10) array[i]}, which has multi pseudo-type (C?)* which simplifies to multi type C*, I get an Iterable<C>. If I do {"Hello", "world".length} I get a Cons<String,Cons<Integer,Nil>>.
Similarly, if I have a nonempty list of strings strs (i.e. of type Cons<String,Iterable<String>>), then {for (str in strs) str.length} will be a Cons<Integer,Iterable<Integer>>. That is, the comprehension preserves non-emptiness in the type system.
Now, suppose a method foo's returned multi type is String, Integer, Nothing. Then I could do value str, num = foo() to break up the tuple into parts. Or, if I have an Iterable<String> strs I could do if (fst, snd, rem* = strs...) { ... } to attempt to break up strs into its parts. This can lead into pattern matching if we want.
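For comparison, Python's unpacking gives a feel for what this destructuring would do at runtime. This is only an analogy, not the proposed Ceylon semantics: foo and split_first_two are hypothetical names, and Python checks the shape at runtime where the proposal would check it in the type system.

```python
def foo():
    """Stand-in for a method whose result is String, Integer, Nothing."""
    return ("hello", 5)


# Analogous to: value str, num = foo()
s, num = foo()


def split_first_two(strs):
    """Analogous to: if (fst, snd, rem* = strs...) { ... }

    Returns (fst, snd, rem) when strs has at least two elements,
    or None when the pattern does not match.
    """
    items = list(strs)
    if len(items) < 2:
        return None
    fst, snd, *rem = items
    return fst, snd, rem
```

The None result plays the role of the failed if branch: the pattern only matches when there are at least two elements to bind.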
Okay, that's all I'm saying for now. Sorry it's so messy. This really is too much for text. The big thing again is separation between class types (such as iterables) and multi types (such as comprehensions) to prevent ambiguities. What do you think (if that made sense)?
For example, now printLn(obj) is different from printLn(obj.string).
I have no problem with that. obj.string is not the same as obj.
What happens when we have for(obj in objs) obj and one of the objs is a Sequenced?
Then one iteration contributes more values to the result than the others. I actually find it pretty cool that that's possible. Maybe there are situations where you want something to never be treated as a sequenced argument, even if it's Sequenced. It would be easy to provide some kind of un-sequence method for this purpose, basically the inverse of "...". But I'm not sure if that's necessary.
What do you think (if that made sense)?
It makes sense and is actually pretty interesting in theory. But it is also more type system complexity than I'm ready to accept for a language like Ceylon. I think the current Ceylon type system is pretty much at the limit of what we should confront users with. And while your idea is surely interesting I don't see sufficient benefit to justify crossing that limit.
And BTW: as explained earlier I'm also still not ok with Maybe<T> ;-)
Actually, the same argument I supplied before applies to your newest proposal:
Suppose printLn(Object) already existed, then the implementer decided to extend it to handle multiple args, changing the signature to printLn(Object...). This special casing approach will then result in a change to semantics, though one I wouldn't expect anyone to anticipate.
As for mine being too complicated, you can always start with the simplest forms and then expand it as necessary. For example, start off with only two multi types: C? and C*.
Actually, the same argument I supplied before applies to your newest proposal
I think you missed the part of my post that discusses this. In short: I don't find that very problematic in this case, as the effect is restricted to Sequenced and cannot occur with Iterables anymore.
Another possibility just occurred to me: making Sequenced a subtype of Void instead of Object, so that Void would be of Object|Sequenced|Nothing. But I haven't really thought about that yet, so maybe it's nonsense...
Whenever you say something is not a subtype of Object, you're essentially saying you want another kind of types that doesn't work quite like the existing types. I'm trying to embrace that rather than try to sidestep it so that we can make it more official, more structured, and more flexible.
I think this is getting too complicated, regarding the relatively simple problem of whether an argument is sequenced or not. So forget about Comprehension, Sequenced, multi types and other tricks. Here's a more pragmatic version of my proposal that uses only simple Iterables instead:
The current comprehension mechanism is replaced by the following new operators:
for (...) expr: represents an Iterable which evaluates the expression for each input element, skipping resulting null values.
if (...) expr: replaces the then operator, i.e. evaluates to null if the condition is false.
let (...) expr: introduces a temporary attribute, as suggested earlier. Just executing an expression (let(i++)...) should also be allowed, as people would otherwise probably write let(dummy=i++)... instead.
aggregate (...) expr: similar to for, but the expression must be of type Iterable<T>? and all resulting elements are aggregated into a single Iterable.
Of course these operators would not be allowed at the beginning of an expression statement, to avoid confusion with the for and if statements. If the operator precedence is chosen accordingly (for/aggregate/let < else < if) then they can be used like the current comprehensions.
Examples of how for can be used:
var it = for (i in 1..3) i*i; // Iterable
printLn(it...); // three Integers
printLn(it); // one Iterable
printLn(for (i in 1..3) i*i ...); // three Integers
printLn(for (i in 1..3) i*i); // one Iterable
printLn {
values = for (i in 1..3) i*i; // three Integers
};
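The difference between passing the Iterable itself and spreading it can be mimicked in Python. The sketch below is a loose model, not Ceylon: for_comprehension and print_ln are hypothetical helper names, None stands in for null, and the result is materialized into a list because a Python generator can only be consumed once, unlike the reusable Iterable assumed above.

```python
def for_comprehension(source, expr):
    """Lazily evaluate expr for each element, skipping None results."""
    for element in source:
        result = expr(element)
        if result is not None:
            yield result


def print_ln(*args):
    """Stand-in for printLn; returns how many arguments it received."""
    print(*args)
    return len(args)


# var it = for (i in 1..3) i*i;
squares = list(for_comprehension(range(1, 4), lambda i: i * i))

spread_count = print_ln(*squares)  # like printLn(it...): three Integers
single_count = print_ln(squares)   # like printLn(it): one Iterable
```

The star in print_ln(*squares) plays the role of the trailing "..." in the Ceylon examples: without it, the callee sees a single argument.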
That's a bit different from how it works now, but also much more regular and flexible. Here are examples for aggregate:
var it1 = for (i in 1..2) for (j in 1..2) i*10+j; // Iterable of Iterables
var it2 = aggregate (i in 1..2) for (j in 1..2) i*10+j; // Iterable of Integers
var it3 = aggregate (i in 1..3) 1..i; // 1, 1, 2, 1, 2, 3
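As a cross-check of the three results, the same expressions can be modeled in Python, where aggregate is written as an ordinary generator function. This is a sketch under assumptions: aggregate here is a hypothetical helper, None stands in for a null inner Iterable, and 1..n is written as range(1, n + 1).

```python
def aggregate(source, expr):
    """Evaluate expr for each element and flatten the resulting iterables.

    A None result is skipped, mirroring the Iterable<T>? type of the expression.
    """
    for element in source:
        inner = expr(element)
        if inner is not None:
            yield from inner


# for (i in 1..2) for (j in 1..2) i*10+j  -> Iterable of Iterables
it1 = [[i * 10 + j for j in range(1, 3)] for i in range(1, 3)]

# aggregate (i in 1..2) for (j in 1..2) i*10+j  -> Iterable of Integers
it2 = list(aggregate(range(1, 3), lambda i: (i * 10 + j for j in range(1, 3))))

# aggregate (i in 1..3) 1..i
it3 = list(aggregate(range(1, 4), lambda i: range(1, i + 1)))
```

Under this model it1 is [[11, 12], [21, 22]], it2 is [11, 12, 21, 22], and it3 is [1, 1, 2, 1, 2, 3], matching the comment on the last example.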
And this is how Iterable.indexed could be implemented, also using the proposed => syntax:
shared default Iterable<Entry<Integer,Element&Object>> indexed =>
let (i:=0) for (e in this) let (++i) if (exists e) i->e;
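Reading that definition literally, the counter is bumped once per input element, before the null check, so null elements still consume an index. A minimal Python sketch of that reading follows; it is an interpretation, not the actual ceylon.language implementation, with None standing in for null and a tuple standing in for an Entry.

```python
def indexed(elements):
    """Literal transcription of the let/for/let/if chain above."""
    i = 0                  # let (i:=0)
    for e in elements:     # for (e in this)
        i += 1             # let (++i): runs for every element, null or not
        if e is not None:  # if (exists e)
            yield (i, e)   # i->e


pairs = list(indexed(["a", None, "b"]))
```

Under this reading pairs is [(1, "a"), (3, "b")]: the first index is 1 because the increment happens before the pairing. A zero-based version would need to pair first and increment afterwards, or start the counter at -1.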
So, if ts is Iterable<T>, then should for (t in ts) t contain the same number of elements?
Alternatively, for (...) ... doesn't result in an Iterable, instead use {...} to get an Iterable. If you want a specific implementation, say an array, then do Array<T>(for (...) ...).
for (...) expr: represents an Iterable which evaluates the expression for each input element, skipping resulting null values.
if (...) expr: replaces the then operator, i.e. evaluates to null if the condition is false.
This is certainly elegant, and I appreciate the regularity. Except: if (x) y else z. I guess I might hate it less if else were renamed to : as in if (x) y:z but that's already getting cryptic.
aggregate (...) expr: similar to for, but the expression must be of type Iterable<T>? and all resulting elements are aggregated into a single Iterable.
Perhaps I'm being dense, but I don't quite follow how aggregate() can't just be a function.
Another possibility just occurred to me: making Sequenced a subtype of Void instead of Object, so that Void would be of Object|Sequenced|Nothing. But I haven't really thought about that yet, so maybe it's nonsense...
FTR, this doesn't sound like nonsense to me, and I'm not viscerally against it. It may be that the additional complexity of this solution is less bad than the wartiness of sequenced parameters. And from a philosophical point of view, it's not really wrong to say that I can have nothing, something, or many. Indeed, in many modeling languages, that's the basic way of expressing relationships.
I think we should explore this possibility further.
The basic problem with Object|Sequenced|Nothing is that presumably our existing collection types, starting right at the top with Category and Iterable, are instances of Sequenced and not Object. Currently Ceylon doesn't let you define an interface that is not a subtype of Object, so we would need to slightly adjust the type system. I think this can be made to work out.
Furthermore, collections can't be passed to functions not explicitly declared to accept Sequenced. At first blush, it feels like this probably would work out perfectly fine, since T... would I suppose mean T|Sequenced<T>, but it would certainly have far-reaching consequences that we would need to think all the way through.
Perhaps there is a slightly less aggressive solution where you have class Void() of Nothing|Object and class Object() of One|Many {}, that sidesteps these problems. It's not clear to me right now to what extent this would solve the real problems we're interested in, and to what extent it would introduce frustrating limitations.
Comprehensions are a really cool language feature and they would be even more useful if there was a way to declare and initialize values or variables that were internal to the comprehension. The keyword given can be used to enclose a declaration which can be used from that moment on:
No need to state value or variable since it can be inferred from the declaration: = means value, := means variable. Local type inference is already done.
UPDATE: what we would really do, given all the evolution in the language since this was originally proposed, would be to support the following: