ceylon / ceylon-spec

DEPRECATED
Apache License 2.0

let expressions in comprehensions #377

Open chochos opened 12 years ago

chochos commented 12 years ago

Comprehensions are a really cool language feature, and they would be even more useful if there were a way to declare and initialize values or variables that are internal to the comprehension. The keyword given can be used to enclose a declaration, which can then be used from that point on:

given (i:=0) for (x in xs) something(x,i++)

for (x in xs) given (t=x*x+sqrt(x)+someOtherExpensiveCalculationWith(x)) for (y in ys) x%y

No need to state value or variable since it can be inferred from the declaration: = means value, := means variable. Local type inference is already done.
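
As a rough analogue from outside Ceylon, the per-element binding in the second example can be sketched in Python, which has no such clause in comprehensions but lets a single-element inner loop play that role. The names expensive, xs and ys are illustrative stand-ins, not part of the proposal, and unlike the example above the sketch uses t in the result so that the binding is visibly needed.

```python
calls = []

def expensive(x):
    """Stand-in for the costly expression; records each evaluation."""
    calls.append(x)
    return x * x + x ** 0.5

xs = [4, 9]
ys = [2, 3, 5]

# "for t in [expensive(x)]" binds t once per x, before the inner loop over ys,
# which is the role the proposed given clause would play.
result = [t % y for x in xs for t in [expensive(x)] for y in ys]
```

The costly expression is evaluated once per element of xs rather than once per (x, y) pair.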

UPDATE: what we would really do, given all the evolution in the language since this was originally proposed, would be to support the following:

let (c=Counter()) for (x in xs) something(x,c.next())

for (x in xs) let (t=x*x+sqrt(x)+someOtherExpensiveCalculationWith(x)) for (y in ys) x%y

ikasiuk commented 12 years ago

Nice idea! I have also noticed in the past that something like that could be useful, and we recently had problems in that direction:

https://github.com/ceylon/ceylon.language/commit/e6210303ed21b264416c2e2d2ca152e8c023e558

RossTate commented 12 years ago

Snazzy! I think this in combination with the restriction I mentioned in that thread would be awesome.

gavinking commented 12 years ago

There's actually a second bit of this proposal that @chochos forgot to write down. The idea has two parts:

  1. allow given subclauses to declare locals, and
  2. allow expressions between subclauses.

The second bit would let you write:

given (i:=0) for (x in xs) i++ if (exists x) i->x

Which fixes what tripped us up in the definition of Iterable.indexed.
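
For readers less familiar with the problem, the behavior this seems to be after can be sketched in Python as an explicit loop. The function name indexed and the zero-based positions are assumptions here; the snippet above does not pin down whether the increment happens before or after the entry is produced.

```python
def indexed(xs):
    """Pair each non-null element with its position in xs."""
    i = 0                            # plays the role of "given (i:=0)"
    entries = []
    for x in xs:
        if x is not None:            # "if (exists x)"
            entries.append((i, x))   # "i->x"
        i += 1                       # the counter advances for every element, null or not
    return entries
```

The point is that the counter has to advance outside the filter, which a comprehension without expressions between subclauses cannot express.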

chochos commented 12 years ago

Right, I forgot about expressions between subclauses. Well, that's the whole idea right there.

FroMage commented 12 years ago

First, I'd rather use the consecrated keyword for this and allow multiple vars to be defined:

let (i:=0;j=foo()) for (x in xs) something(x,i++)

With semantics like letrec, where each binding can see the others, allowing you to define things like:

let (a:=() b();b=() a) for (x in xs) something(x,i++)

And I'd even make that available outside of comprehensions as an expression, and what the hell allow statements in it too:

value foo = let (a = f(); b = g()) { a * a + b * 1/b }

Mmm, that wouldn't be too useful if we could define inline statements like that:

value foo = { a = f(); b = g(); return a * a + b * 1/b; }

Which would be equivalent to:

// define a lambda that contains statements, which we call immediately
value foo = (function (){ a = f(); b = g(); return a * a + b * 1/b; })();

But that brings me to something that's been bothering me about comprehensions for a while:

for and if have completely different semantics depending on whether or not they are used as a statement or in a comprehension. I wouldn't see any problem if for and if could be used inside expressions, as their semantics would be very similar. But for an expression if we use the combination of then and else, which is already a bit irregular (test() then foo() else bar() could be written as if test() foo() else bar() or a variation with punctuation to make it more regular with the statement equivalent).

Now, I love comprehensions but I find it disconcerting and irregular that within them, the meanings of for and if are mapped not to their respective statements but to map and filter. I don't really have a better idea though, but I thought it'd be something worth noting, as regularity is something we care about.

Also, within a comprehension, the procedural rules stop applying: no need for brackets anymore, or semicolons; what look like statements are expressions and they return values; and the absence of else for an if has an entirely different meaning, as we're suddenly building lists and mimicking the behaviour of map/filter (which again, is pretty cool). I have the feeling that comprehensions are a different language altogether with its own syntax. Adding new stuff to it, like local bindings, makes it even more complex and even more alien to the rest of the language.

Don't take this as just me randomly bashing something: I love comprehensions and I think we should keep them and possibly extend them with local bindings, but I feel the syntax is too alien to the rest of Ceylon and I am afraid we're introducing a language-within-a-language syntax.

That statement in particular looks entirely like a different language:

given (i:=0) for (x in xs) i++ if (exists x) i->x

Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?

Rewriting this statement with a more Ceylon-like syntax:

value foo = { variable i:=0; for (x in xs){ i++; if (exists x) i->x; }}

Which would even be valid when used as a statement:

variable i:=0; 
for (x in xs){ 
 i++; 
 if (exists x)  // well, if we allow single-statement ifs to drop braces ;)
  i->x; 
}

It's just an idea, but I have a feeling that would be more powerful, intuitive and regular.

What's the return type of an expression block? Its last statement's return type, or that of its 'return' statements. The return type of a for or while expression is again that of its last statement. The return type of an if is the union of the last statements of the then/else blocks.

Wouldn't something like that work?

gavinking commented 12 years ago

I'm fine with let here. Don't love the defining-multiple-variables-in-one-let bit just because we don't let you do that in other places.

I also agree that if we introduce this feature in comprehensions, then we'll also need to explore where else in the language it would be useful. For example, regularity suggests the following:

let (x = something()) {
    //x is defined just in this local scope
}

gavinking commented 12 years ago

Now, I love comprehensions but I find it disconcerting and irregular that within them, the meanings of for and if are mapped not to their respective statements but to map and filter.

FTR, that's not actually how it works under the covers. The idea of mapping to map() and filter() had problems because we allow products.

Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?

I spent a lot of time thinking about this, on several occasions, and just never came up with anything satisfying:

RossTate commented 12 years ago

for and if have completely different semantics depending on whether or not they are used as a statement or in a comprehension.

I think of them as actually very similar. For example, if means "if this condition holds (at run time), then include this statement/expression here (at run time)". They get implemented slightly differently and they have slightly different syntax, but to me they are very much akin. (@gavinking, note that this interpretation gives a semantics for while in comprehensions.)

As for the curly braces, the sole purpose of curly braces is to disambiguate nesting. With comprehensions we have decided to optimize for what we expect to be the common case: everything nests. That is, everything after a for or an if is inside the for/if. This lets us and implies we should get rid of curly braces since there is no ambiguity. Of course there is a cost to this: we can't express comprehensions that require more complex nesting behavior. In particular, this includes things like else and switch.

I do find it weird that we get rid of semicolon after nested statements though. In fact, I'm a little worried that'll prevent us from using any syntax that relies on the presence of a semicolon in normal statements to disambiguate.

gavinking commented 12 years ago

I agree with @RossTate—I don't think it's fair to say they have different semantics at all.

quintesse commented 12 years ago

Hey, if we allow

let (x = something()) {
    //x is defined just in this local scope
}

why not go all Haskell-y and allow

{
    //x is defined just in this local scope
} given (x = something())

as well? :)

gavinking commented 12 years ago

So the tangential discussion on this thread fits very much with the tangential discussion on #363. Ceylon today, like most mainstream languages, is "statement-oriented" and optimized for methods with multiple statements. Now I personally just love to write methods/getters with just single-expressions wherever that is reasonable, and for that reason I kinda like languages which are more "expression-oriented" in their syntax. Which is why we have stuff like comprehensions and then/else, I guess.

Up until now, I have not been that keen on introducing something like:

function name() => person.name.first + " " + person.name.last;

because once you try to refactor out common sub-expressions, you need a big change to the syntax (sure, the IDE can do this for you, but still). So it's not a syntax that "scales". It's an abbreviation that winds up getting in the way when you start maintaining the code.

On the other hand, let in expressions didn't make much sense either, since you could just refactor out the common subexpression to a statement:

function name() {
    Name n = person.name;
    return n.first + " " + n.last;
}

However, if we have both these features, then we have something that makes sense. It becomes possible to write the above in the following "expression-oriented" form:

function name() => let (n = person.name) n.first + " " + n.last;

So then we need to answer the question: is this an improvement over the "statement-oriented" form, and is having two ways to write the same thing a good or a bad thing in this case. I'm interested to know what you guys think.

I personally find the definition with => and let rather nice to look at. But it's hard to argue that it's objectively better when it involves really about the same number of tokens. Still, even if counting tokens can't explain my like for it, perhaps there's some other explanation...

gavinking commented 12 years ago

@quintesse FTR, I actually prefer the keyword given in both locations, for how it reads in English. I'm not especially keen on the postfix given because it's an irregular syntax compared to everything else we have in the language.

quintesse commented 12 years ago

Sure, I wasn't actually being serious, especially because in Haskell it's used to even define localized functions where it makes more sense to put them after the "important" work.

PS I'm not sure how I feel about the short-cut syntax yet, but it sure makes me want the other Haskell syntax of being able to do:

Integer fib(0) => 0;
Integer fib(1) => 1;
Integer fib(n) => fib(n-1) + fib(n-2);

hehe

FroMage commented 12 years ago

what the fuck should while evaluate to?!

An iterator.

if has a reasonable interpretation in an expression context, but if (foo) bar else baz is just visually horrible.

If punctuation for expressions is optional (like for curly braces), then surely for single braces too: if foo else bar

I don't think it's fair to say they have different semantics at all.

Perhaps you're right, but it's still a language-within-a-language. Again, perhaps that's not an issue at all, but I wonder if it really isn't, and if we can't do something to unify both.

FroMage commented 12 years ago

I think the syntax you are pointing to depends on your background. For me it's C, so even in languages like Scheme or JavaScript where both alternatives are possible I find myself writing:

function name(){
  var n = person.name;
  return n.first + " " + n.last;
}

Rather than:

var name = function(){
  var n = person.name;
  return n.first + " " + n.last;
}

Same in Scheme with:

(define (name) 
 (let ((n person.name)) 
  (append n.first " " n.last)))

rather than:

(define name (lambda ()
 (let ((n person.name)) 
  (append n.first " " n.last))))

Personally I really have trouble making any sense of function name() => person.name.first + " " + person.name.last; but just turning it into function name() { return person.name.first + " " + person.name.last;} makes it readable for me. But I really think it's a background thing, more about habits than quantifiable things.

gavinking commented 12 years ago

If punctuation for expressions is optional (like for curly braces), then surely for single braces too: if foo else bar

Eh? I can't even hazard a guess what if foo else bar might mean... I assume that's a typo?

You're going to need some kind of punctuation to separate the condition from the first expression. So you have a choice between if (foo) bar else baz which is just awfully asymmetric, and if foo then bar else baz, which is already different to the traditional C syntax, or you could go back to something more like C's ternary operator syntax, foo ? bar:baz or if (foo) bar:baz. Personally, I think then and else are superior to any of these options.

FroMage commented 12 years ago

Eh? I can't even hazard a guess what if foo else bar might mean... I assume that's a typo?

Hah, yes of course, sorry: if foo bar else gee

gavinking commented 12 years ago

@FroMage Again FTR, I have a rather extreme aversion to:

value sqr = (Float x) x*x;

To the extent that I think it's almost unfortunate that Ceylon lets you write this. The fact that this even works is more of an unintended consequence—the intersection of two language features that aren't really designed for use together—than an intentional feature. I suppose that everyone here would agree that what we actually want people to write in this case is just:

function sqr(Float x) { return x*x; }

So, indeed, this might be the strongest argument yet for supporting the form:

function sqr(Float x) => x*x;

i.e. if we don't support that Dart/Coffeescript-style arrow syntax, then some people are going to want to use the "value = anon function" form like you quite often see them do in the Scala community. (Apparently in the world of Scala, this idiom is not frowned upon.)

FroMage commented 12 years ago

And function f() => e; would just be a parser alias for function f() { return e; } ?

gavinking commented 12 years ago

@FroMage Right. I guess it's something I could implement in the parser in like 5-10 mins, though I might not do it exactly like that. Supporting let would be a little more work, and would impact the backend, but it's still pretty straightforward, I suppose. (Though I have not put a huge amount of thought into it.)

ikasiuk commented 12 years ago

So then we need to answer the question: is this an improvement over the "statement-oriented" form, and is having two ways to write the same thing a good or a bad thing in this case. I'm interested to know what you guys think.

My opinion: it looks kind of nice, but not so much nicer than the current syntax. And having two ways to write the same thing is a bad thing in this case.

An important goal of Ceylon, according to the home page, is readability. If I interpret this goal correctly then it should lead us to a syntax that is as homogeneous, regular and easy to learn as possible. I think if we take that goal seriously then we can only introduce a new construct into the language if it solves a real, significant new problem. And I just don't see a sufficient benefit in this case.

i.e. if we don't support that Dart/Coffeescript-style arrow syntax, then some people are going to want to use the "value = anon function" form like you quite often see them do in the Scala community. (Apparently in the world of Scala, this idiom is not frowned upon.)

So we have two ways of expressing the same thing but one of them is not so nice. So we introduce a third possibility to lure people away from the one we want to hide ;-) Ok, I understand the reasoning and I also don't really like the current syntactic duality of functions. It would be great if we could find a way to improve that. But I don't think that this is the right way.

gavinking commented 12 years ago

@ikasiuk I agree with all your points, but note that there are a couple of different factors pushing in this direction:

  1. the irregularity of the scope of a type parameter in a type alias definition, as discussed in #363,
  2. the fact that function sqr(Float x) { return x*x; } is a little heavy on the eyes compared to some other recent languages like CoffeeScript and Dart, and
  3. the fear that, in response to 2, some people will be motivated to start writing value sqr = (Float x) x*x;

I don't think that 2+3 on their own would be enough to motivate me to want to make this change. But if other considerations are "pushing" in the same direction—that is, if it just feels like the language naturally wants to grow in that direction—well, then that's a different matter...

But I don't think that this is the right way.

Can you think of a different syntax for partial application of a function / generic type? If we came up with something natural, then that might be a different approach that would give many of the same advantages. The trouble is, when I think about that problem, I start coming up with stuff like:

function sqr(Float x) = Integer.power(*)(2);

which I think is just not at all ceylonic (it's actually just Scala's horrible underscore in disguise), and is actually much less flexible than the fat arrow.

gavinking commented 12 years ago

I start coming up with stuff like:

function sqr(Float x) = Integer.power(*)(2);

which I think is just not at all ceylonic.

Note that according to the language spec as it exists today, I should be able to write a general-purpose function named shuffle() that would let me define sqr() like this:

function sqr(Float x) = shuffle(Integer.power)(2);

i.e. shuffle() swaps the first and second parameter lists of the higher-order function Integer.power, letting you apply the second argument list (the method arguments) before the first (the method receiver).

Now, that's all very cool and powerful but I think it's exactly the kind of thing we don't want people to be tempted to use in "everyday" code.

gavinking commented 12 years ago

Hah, yes of course, sorry: if foo bar else gee

This can't be parsed, consider stuff like:

if x + y + z ...
if x { y } ...

We can't tell where the condition ends, at least not with finite lookahead in a CFG.

ikasiuk commented 12 years ago

Two remarks: I prefer given over let, and I don't share the aversion against value f=(Float x)x*y; (although I personally prefer the alternative).

And one question (I probably just missed that): Why do we need to introduce a new symbol => instead of just using =?

FroMage commented 12 years ago

And one question (I probably just missed that): Why do we need to introduce a new symbol => instead of just using =?

The explanation is in #363, which is actually where most of the explanation for the => is.

Note that let is the keyword used in JavaScript as well, so it's well-known.

ikasiuk commented 12 years ago

Note that let is the keyword used in JavaScript as well, so it's well-known.

let in JavaScript is apparently not widely supported, so "well-known" might be exaggerated. And in general it seems that in Ceylon we tend to prefer the keyword that fits best, rather than what is most popular in other languages.

The explanation is in #363, which is actually where most of the explanation for the => is.

Ah ok, thanks. I agree that => makes sense for type aliases because they are too different from simple assignment. So => would basically mean "alias". But I'm still not sure about using it for functions. Assuming that we introduce => for types and functions, how exactly would it work?

Float x => retrieveX()
assign x => storeX(x);

ikasiuk commented 12 years ago

No need to state value or variable since it can be inferred from the declaration: = means value, := means variable.

That reminds me that the variable and := in something like

variable Integer i := 0;

always feel somewhat redundant. Could we say that an attribute that is initialized with := is always automatically variable even if that annotation is omitted? Maybe we could use the same rule as for type inference and allow that only for non-shared attributes.

chochos commented 12 years ago

I talked about that with @gavinking once. I think the idea is that you must be sure that you want mutability; that's why you have to type that long keyword even if it seems redundant. That's the part about encouraging immutability - if we only leave := then it's not so clear that you are declaring a variable and not an immutable value.

ikasiuk commented 12 years ago

I talked about that with @gavinking once. I think the idea is that you must be sure that you want mutability; that's why you have to type that long keyword even if it seems redundant. That's the part about encouraging immutability - if we only leave := then it's not so clear that you are declaring a variable and not an immutable value.

I agree for shared attributes: the interface of a type shouldn't be defined by such implicit mechanisms. But I don't see a problem with non-shared attributes. It's rather unlikely that you erroneously write := instead of = and then actually produce a situation where that causes any harm. After all, we also allow type inference although we can't check if the inferred type is really the one intended by the user.

ikasiuk commented 12 years ago

Now I know what felt wrong about the proposed => for functions: it's inconsistent unless we make the same change for anonymous functions. If we write a function that consists of a single expression as

Float sqr(Float x) => x**2;

then we are more or less forced to write anonymous functions in the same way:

values.map((Float x) => x**2);

That actually looks surprisingly readable.

ikasiuk commented 12 years ago

Mmm, that wouldn't be too useful if we could define inline statements like that:

value foo = { a = f(); b = g(); return a * a + b * 1/b; }

There are a couple of problems with that. Firstly we would probably have to use [/] for sequence literals instead of {/}, to avoid ambiguity (that's not necessarily a problem of course). Using return could also be rather confusing if the statement appears in a function. In Scala a block of statements simply represents the value resulting from the last statement in the block. I'm not a big fan of that because it often doesn't read very well. Another possibility would be a "secondary" return statement (e.g. yield). But to be honest I don't think this kind of block expression is desirable at all because it makes the code structure less homogeneous and reduces readability.

Would it make any sense in allowing for, while, if, switch and blocks in expression contexts and give them an intuitive meaning that would allow us to have something even more powerful than comprehensions?

I think that would not only be very difficult but also wouldn't fit into the language very well. But that doesn't mean that a slightly more generalized mechanism for comprehensions is impossible:

The first problem is that the concept of a comprehension isn't represented optimally in the type hierarchy. A comprehension produces an Iterable. But when passing a comprehension to a sequenced parameter it behaves differently than when passing an Iterable directly: a comprehension is basically an Iterable plus .... That's not a big problem as long as comprehensions can only be used for sequenced parameters anyway. But if we want to make them more general-purpose expressions then that's something we have to address.

Possible solution: a comprehension represents an object of type Comprehension, and Comprehension is a simple marker interface derived from Iterable. If the static type of a value passed to a sequenced parameter satisfies Comprehension then it is automatically used as an argument sequence, otherwise it's used as a single argument unless ... is applied.

The generalization of the comprehension mechanism could be achieved by introducing three new operators:

Of course these operators would not be allowed as the beginning of an expression statement, to avoid confusion with the for and if statements. If the operator precedence is chosen accordingly (for/let < else < if) then they can completely replace the current comprehension mechanism:

value comp = let (i := 0) for (e in elems) if (exists e) ++i->e;

Unfortunately this doesn't allow expressions between subclauses, so people might end up abusing the let operator for that purpose: let (dummy = ++i). A simple solution may be to allow the following:

value comp = let (i := 0) for (e in elems) let (++i) if (exists e) i->e;

This obviates the need for the elements function, as seen above (because a Comprehension is an Iterable). And of course we can also still write:

value seq = { let (i := 0) for (e in elems) let (++i) if (exists e) i->e; };

Another big advantage is that we can use the individual operators independently in expressions and in particular with the proposed => syntax for functions.

And we can directly use comprehensions also for parameters of type Iterable, not just for sequenced parameters.

RossTate commented 12 years ago

While there are aspects of your proposal I like, making Comprehension an actual type has some issues, primarily with ambiguity. For example, consider the following code (assuming printLnEach(Object...) is defined):

value seq = for (i in 1..3) i * i;
printLnEach(seq);

If we infer type Iterable<Integer> for seq, then this'll print 1\n4\n9\n. If instead we infer type Comprehension<Integer> for seq, then this'll print {1,4,9}\n. So now type inference affects semantics. In general, this issue has to do with behavioral subtyping, a property we'd like but which is incompatible with Comprehension as a specially handled type.
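
The two behaviors being contrasted, one iterable passed as a single argument versus its elements spread over a variadic parameter, can be sketched in Python, where the spread is always written explicitly and so never depends on an inferred type. The helper print_ln_each is a hypothetical stand-in that returns the lines instead of printing them.

```python
def print_ln_each(*args):
    """Return the lines a variadic print-each would emit, one per argument."""
    return [str(a) for a in args]

seq = [i * i for i in range(1, 4)]   # 1, 4, 9

as_single_argument = print_ln_each(seq)    # the iterable itself is the one argument
as_spread_arguments = print_ln_each(*seq)  # each element becomes an argument
```

The same value produces one line or three depending only on how it is passed, which is the choice that would be made implicitly by the static type in the proposal under discussion.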

ikasiuk commented 12 years ago

@RossTate: Well, strictly speaking it's not ambiguous because obviously the inferred type would always be the same (and well-defined). But I do see that making the behavior depend on the static type of the expression could cause problems.

But it is clear that expression-based comprehensions would require us to solve that problem somehow. I see two alternatives to a type-based solution:

RossTate commented 12 years ago

Your second proposal is still ambiguous. printLnEach(for (i in 1..3) i * i) could be interpreted as a single (iterable) argument or as three (integer) arguments.

Your first proposal doesn't compose properly. For example, for (i in 1..3) for (j in 1..3) i*10 + j could be an iterable of iterables of integers (since for (j in 1..3) i*10 + j is an iterable) or an iterable of integers.

ikasiuk commented 12 years ago

Your second proposal is still ambiguous. printLnEach(for (i in 1..3) i * i) could be interpreted as a single (iterable) argument or as three (integer) arguments.

The idea of the second solution is that this would always be treated as a special case (i.e. three integers in this case).

Your first proposal doesn't compose properly. For example, for (i in 1..3) for (j in 1..3) i*10 + j could be an iterable of iterables of integers (since for (j in 1..3) i*10 + j is an iterable) or an iterable of integers.

Not sure what you mean, how could this result in an iterable of integers? The interpretation depends on the precedence and associativity of the operators, and of course you can use parentheses if necessary. In this case for (i in 1..3) for (j in 1..3) i*10 + j = for (i in 1..3) (for (j in 1..3) (i*10+j)).

RossTate commented 12 years ago

The idea of the second solution is that this would always be treated as a special case (i.e. three integers in this case).

Suppose printLn(Object) already existed, then the implementer decided to extend it to handle multiple args, changing the signature to printLn(Object...). This special casing approach will then result in a change to semantics, though one I wouldn't expect anyone to anticipate.

Not sure what you mean, how could this result in an iterable of integers?

My understanding of the current proposals is that for (i in 1..3) for (j in 1..3) i*10 + j is supposed to be a bunch of integers, in this case corresponding to 11, 12, 13, 21, 22, 23, 31, 32, 33.

chochos commented 12 years ago

@RossTate you're right, that's how comprehensions are implemented. for (i in 1..3) for (j in 1..3) i*10 + j results in 11, 12, 13, 21, 22, 23, 31, 32, 33.
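
The two readings at issue can be put side by side in Python, used here only as an analogy: chained for clauses produce one flat sequence, while an explicitly nested comprehension produces a sequence of sequences.

```python
# Chained clauses: one flat list, matching the behavior described above.
flat = [i * 10 + j for i in range(1, 4) for j in range(1, 4)]

# Explicit nesting: a list of lists, the "iterable of iterables" reading.
nested = [[i * 10 + j for j in range(1, 4)] for i in range(1, 4)]
```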

ikasiuk commented 12 years ago

Ok, those are valid points. But I'm not giving up yet :-) So here's a different approach:

Instead of a Comprehension interface derived from Iterable, introduce a class Sequenced encapsulating an Iterable:

shared class Sequenced<out Element>(iterable) extends Object() {
    shared Iterable<Element> iterable;
    shared actual Boolean equals(Object other) {
        if (is Sequenced<Element> other) { return iterable==other.iterable; }
        return false;
    }
    shared actual Integer hash { return iterable.hash; }
    shared actual String string { return iterable.string + "..."; }
}

The for operator returns a Sequenced. And x... is defined as Sequenced(x) for any Iterable x.

If an object obj is passed to a sequenced parameter param and obj is Sequenced then param=obj.iterable (i.e. sequenced argument) else param={obj} (i.e. single argument). In most cases this can be decided at compile time, but not always! In general, the argument is equivalent to what would be returned by the following function:

Iterable<Element> iterable<Element>(Sequenced<Element>|Element obj) {
    if (is Sequenced<Element> obj) { return obj.iterable; }
    if (is Element obj) { return {obj}; }
    throw;
}

It may seem strange at first that it is sometimes a runtime decision whether the argument is a sequenced argument or not. But it actually makes sense: in printLn(obj) the object obj will only be passed as a sequenced argument if it is Sequenced, and Iterable and Sequenced are separate types. So an Iterable will not be unintentionally used as a sequenced argument.

The case mentioned by @RossTate where the parameter type is changed from Object to Object... can still occur - but not with an Iterable, only with Sequenced. I guess in this case the changing behavior is somewhat more understandable.
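
The runtime decision described above can be sketched in Python as a small wrapper class plus a dispatch function. This is an illustration under assumptions, not the proposed Ceylon implementation; to_arguments is a hypothetical name for the step that turns what the caller passed into the argument list.

```python
class Sequenced:
    """Wrapper marking an iterable as 'spread me over a variadic parameter'."""

    def __init__(self, iterable):
        self.iterable = iterable

def to_arguments(obj):
    """Decide at run time whether obj is a sequenced argument or a single one."""
    if isinstance(obj, Sequenced):
        return list(obj.iterable)   # sequenced: param = obj.iterable
    return [obj]                    # single: param = {obj}

squares = [i * i for i in range(1, 4)]
```

With this, to_arguments(Sequenced(squares)) yields the three integers, while to_arguments(squares) yields a single argument that happens to be a list, so a plain iterable is never spread by accident.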

Examples:

printLn(for (i in 1..3) i*i);  // sequenced
value iter1 = (for (i in 1..3) i*i).iterable;
value iter2 = elements { for (i in 1..3) i*i }; // equivalent to iter1
value seq = { for (i in 1..3) i*i }; // works as usual
printLn(iter1); // NOT sequenced
printLn(iter1...); // equivalent to printLn(Sequenced(iter1))

This allows an important change to the for operator: the right-hand side is treated like a sequenced parameter. So if the RHS is Sequenced then it is treated as a list of values, otherwise as a single value. All values from all iterations of the for are aggregated into a single sequence. That sounds complicated but is actually fairly straightforward:

printLn(for (i in 1..2) for (j in 1..2) i*10+j); // 11, 12, 21, 22
printLn(for (i in 1..2) { for (j in 1..2) i*10+j }); // { 11, 12 }, { 21, 22 }
value iter1 = elements { for (i in 1..2) for (j in 1..2) i*10+j }; // Iterable of Integers
value iter2 = elements { for (i in 1..2)
                         elements { for (j in 1..2) i*10+j } }; // Iterable of Iterables
printLn(for (i in 1..10) (myIterable(i)...)); // aggregate Iterables into a single Iterable

RossTate commented 12 years ago

There are still issues. For example, now printLn(obj) is different from printLn(obj.string). What happens when we have for(obj in objs) obj and one of the objs is a Sequenced?

I have an idea combining your thoughts here with my idea for handling null cleanly in ceylon/ceylon.language#136 (which no one has gotten back to me on). Wait a sec and I'll explain what I have in mind.

RossTate commented 12 years ago

Okay, so in ceylon/ceylon.language#136, I break things up into class types and case types. Now I'm gonna extend that idea with, say, multi types. Eventually we should combine multi and case types to get something very powerful, but for now let's just play basic multi types.

To begin, Iterable is a class type, whereas your ideas of Comprehension or Sequenced would be multi types. By keeping them distinct spaces, we avoid the ambiguity issues I was raising. In particular, Comprehension and Sequenced are not subtypes of Object.

For now, a multi type M has the following grammar:

M ::= Nothing | C, M | M|M | C*

This is a lot like a regular expression with class types C as letters, except with restrictions for sake of keeping things unambiguous, though here I've made them much more restrictive than necessary. Note that we can express null: C? is just the multi type (C,Nothing)|Nothing. We can also express tuples: A,B,Nothing.

Now, an Iterable is a class type, and so for comprehensions essentially convert iterables to multi types. if comprehensions also result in multi types. null has multi type Nothing.

On the other hand, {-} turns a multi type into a class type. Now, the specific class type this results in can depend on the specific multi type being captured. For example, if the multi type is (C,Nothing)|Nothing then {-} results in Maybe<C>. So, if I do {array[5]} I get a Maybe<C>. If I do {for (i in 0..10) array[i]}, which has multi pseudo-type (C?)* which simplifies to multi type C*, I get an Iterable<C>. If I do {"Hello", "world".length} I get a Cons<String,Cons<Integer,Nil>>.

Similarly, if I have a nonempty list of strings strs (i.e. of type Cons<String,Iterable<String>>), then {for (str in strs) str.length} will be a Cons<Integer,Iterable<Integer>>. That is, the comprehension preserves non-emptiness in the type system.

Now, suppose a method foo's returned multi type is String, Integer, Nothing. Then I could do value str, num = foo() to break up the tuple into parts. Or, if I have an Iterable<String> strs I could do if (fst, snd, rem* = strs...) { ... } to attempt to break up strs into its parts. This can lead into pattern matching if we want.
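
As a loose analogy for the `if (fst, snd, rem* = strs...)` form, here is a small Python sketch of a destructuring attempt that can fail. The function name `split_two` is invented for this illustration, and returning `None` stands in for the if condition not matching; none of this is Ceylon syntax or part of the proposal.

```python
from itertools import islice
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def split_two(items: Iterable[T]) -> Optional[Tuple[T, T, List[T]]]:
    """Try to break `items` into its first two elements and the remainder.

    Returns None when fewer than two elements are available, which plays
    the role of the `if (...)` pattern failing to match.
    """
    it = iter(items)
    head = list(islice(it, 2))   # take at most two elements
    if len(head) < 2:
        return None
    return head[0], head[1], list(it)  # the rest of the iterator is `rem`


match = split_two(["a", "b", "c", "d"])
if match is not None:
    fst, snd, rem = match
```

The point of the analogy is only that the match either binds all three names or binds none of them.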

Okay, that's all I'm saying for now. Sorry it's so messy. This really is too much for text. The big thing again is separation between class types (such as iterables) and multi types (such as comprehensions) to prevent ambiguities. What do you think (if that made sense)?

ikasiuk commented 12 years ago

> For example, now printLn(obj) is different from printLn(obj.string).

I have no problem with that. obj.string is not the same as obj.

> What happens when we have for(obj in objs) obj and one of the objs is a Sequenced?

Then one iteration contributes more values to the result than the others. I actually find it pretty cool that that's possible. Maybe there are situations where you want something to never be treated as a sequenced argument, even if it's Sequenced. It would be easy to provide some kind of un-sequence method for this purpose, basically the inverse of .... But I'm not sure if that's necessary.

> What do you think (if that made sense)?

It makes sense and is actually pretty interesting in theory. But it is also more type system complexity than I'm ready to accept for a language like Ceylon. I think the current Ceylon type system is pretty much at the limit of what we should confront users with. And while your idea is surely interesting I don't see sufficient benefit to justify crossing that limit.

And BTW: as explained earlier I'm also still not ok with Maybe<T> ;-)

RossTate commented 12 years ago

Actually, the same argument I supplied before applies to your newest proposal:

Suppose printLn(Object) already existed, then the implementer decided to extend it to handle multiple args, changing the signature to printLn(Object...). This special casing approach will then result in a change to semantics, though one I wouldn't expect anyone to anticipate.
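
To make the concern concrete, here is a hedged Python sketch of the two signatures. `Sequenced`, `print_ln_v1` and `print_ln_v2` are hypothetical stand-ins, and the functions return the lines they would print so the difference is easy to compare; the automatic spreading in the second version is an assumption about how the special-casing would behave.

```python
from typing import Any, List


class Sequenced:
    """Hypothetical value that is spread when passed to a sequenced parameter."""

    def __init__(self, *items: Any) -> None:
        self.items = list(items)

    def __repr__(self) -> str:
        return "Sequenced(" + ", ".join(map(repr, self.items)) + ")"


def print_ln_v1(obj: Any) -> List[str]:
    """Original signature, modelled on printLn(Object): one argument, one line."""
    return [repr(obj)]


def print_ln_v2(*objs: Any) -> List[str]:
    """Extended signature, modelled on printLn(Object...): a Sequenced
    argument is spread automatically, every other argument is one line."""
    lines: List[str] = []
    for obj in objs:
        if isinstance(obj, Sequenced):
            lines.extend(repr(x) for x in obj.items)
        else:
            lines.append(repr(obj))
    return lines


value = Sequenced(1, 2, 3)
before = print_ln_v1(value)  # the caller's code is unchanged ...
after = print_ln_v2(value)   # ... but the output is not
```

The same call site yields one line before the signature change and three lines after it, which is the silent change in semantics being described.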

As for mine being too complicated, you can always start with the simplest forms and then expand it as necessary. For example, start off with only two multi types: C? and C*.

ikasiuk commented 12 years ago

> Actually, the same argument I supplied before applies to your newest proposal

I think you missed the part of my post that discusses this. In short: I don't find that very problematic in this case as the effect is restricted to Sequenced and can not occur with Iterables anymore.

Another possibility just occurred to me: making Sequenced a subtype of Void instead of Object, so that Void would be of Object|Sequenced|Nothing. But I haven't really thought about that yet, so maybe it's nonsense...

RossTate commented 12 years ago

Whenever you say something is not a subtype of Object, you're essentially saying you want another kind of types that doesn't work quite like the existing types. I'm trying to embrace that rather than try to sidestep it so that we can make it more official, more structured, and more flexible.

ikasiuk commented 12 years ago

I think this is getting too complicated, given the relatively simple problem of whether an argument is sequenced or not. So forget about Comprehension, Sequenced, multi types and other tricks. Here's a more pragmatic version of my proposal that uses only simple Iterables instead:

The current comprehension mechanism is replaced by the following new operators:

Of course these operators would not be allowed as the beginning of an expression statement, to avoid confusion with the for and if statements. If the operator precedence is chosen accordingly (for/aggregate/let < else < if) then they can be used like the current comprehensions.

Examples of how for can be used:

var it = for (i in 1..3) i*i; // Iterable
printLn(it...); // three Integers
printLn(it); // one Iterable
printLn(for (i in 1..3) i*i ...); // three Integers
printLn(for (i in 1..3) i*i); // one Iterable
printLn {
    values = for (i in 1..3) i*i; // three Integers
};

That's a bit different from how it works now, but also much more regular and flexible. Here are examples for aggregate:

var it1 = for (i in 1..2) for (j in 1..2) i*10+j; // Iterable of Iterables
var it2 = aggregate (i in 1..2) for (j in 1..2) i*10+j; // Iterable of Integers
var it3 = aggregate (i in 1..3) 1..i; // 1, 1, 2, 1, 2, 3
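
For comparison, the difference between for and aggregate in these three examples can be approximated in Python. This is a simplified sketch: `for_` and `aggregate` are hypothetical helper names, lists are used instead of lazy Iterables for readability, and the null-skipping behaviour of the proposed operators is left out.

```python
from itertools import chain
from typing import Any, Callable, Iterable, List


def for_(source: Iterable[Any], body: Callable[[Any], Any]) -> List[Any]:
    """Model of `for (x in source) body(x)`: one result per input element."""
    return [body(x) for x in source]


def aggregate(source: Iterable[Any],
              body: Callable[[Any], Iterable[Any]]) -> List[Any]:
    """Model of `aggregate (x in source) body(x)`: each result must itself
    be iterable, and all of their elements are concatenated."""
    return list(chain.from_iterable(body(x) for x in source))


# for (i in 1..2) for (j in 1..2) i*10+j       -> Iterable of Iterables
it1 = for_(range(1, 3), lambda i: for_(range(1, 3), lambda j: i * 10 + j))

# aggregate (i in 1..2) for (j in 1..2) i*10+j -> Iterable of Integers
it2 = aggregate(range(1, 3), lambda i: for_(range(1, 3), lambda j: i * 10 + j))

# aggregate (i in 1..3) 1..i
it3 = aggregate(range(1, 4), lambda i: range(1, i + 1))
```

In this model aggregate is just for followed by a flattening step, comparable to a flat-map in other languages.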

And this is how Iterable.indexed could be implemented, also using the proposed => syntax:

shared default Iterable<Entry<Integer,Element&Object>> indexed =>
            let (i:=0) for (e in this) let (++i) if (exists e) i->e;
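
Read literally, that one-liner corresponds to the following Python generator. This is only a sketch of one possible reading: the function name and the use of tuples for entries are assumptions, and `None` stands in for Ceylon's null.

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

T = TypeVar("T")


def indexed(elements: Iterable[Optional[T]]) -> Iterator[Tuple[int, T]]:
    """Model of: let (i:=0) for (e in this) let (++i) if (exists e) i->e"""
    i = 0                    # let (i:=0)   declared once, before iteration
    for e in elements:       # for (e in this)
        i += 1               # let (++i)    runs for every element, null or not
        if e is not None:    # if (exists e)
            yield (i, e)     # i->e


entries = list(indexed(["a", None, "b"]))
```

Because the counter is advanced for every element and not only for the existing ones, null elements leave gaps in the indices. Note that under this literal reading the increment happens before the entry is built, so the first index is 1; whether that is intended is not stated in the thread.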

RossTate commented 12 years ago

So, if ts is Iterable<T>, then should for (t in ts) t contain the same number of elements?
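
One way to see why the question matters, assuming the operator semantics from the proposal (for skips results that are null, and if evaluates to null when its condition is false), is the Python sketch below. `for_` and `if_` are hypothetical helpers, `None` plays the role of null, and `if_` evaluates its value eagerly, which the real operator presumably would not.

```python
from typing import Any, Callable, Iterable, List, Optional


def for_(source: Iterable[Any], body: Callable[[Any], Any]) -> List[Any]:
    """Model of `for (x in source) body(x)` that drops null (None) results."""
    results = (body(x) for x in source)
    return [r for r in results if r is not None]


def if_(condition: bool, value: Any) -> Optional[Any]:
    """Model of `if (condition) value`: None when the condition is false."""
    return value if condition else None


ts = ["a", None, "b"]

# for (t in ts) t
identity = for_(ts, lambda t: t)

# for (i in 1..6) if (i % 2 == 0) i
evens = for_(range(1, 7), lambda i: if_(i % 2 == 0, i))
```

Under these assumptions the identity comprehension returns fewer elements than `ts` whenever `ts` contains nulls, while the same skipping rule is what lets a nested if act as a filter.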

Alternatively, for (...) ... doesn't result in an Iterable, instead use {...} to get an Iterable. If you want a specific implementation, say an array, then do Array<T>(for (...) ...).

gavinking commented 12 years ago

>   • for (...) expr: represents an Iterable which evaluates the expression for each input element, skipping resulting null values.
>   • if (...) expr: replaces the then operator, i.e. evaluates to null if the condition is false.

This is certainly elegant, and I appreciate the regularity. Except:

>   • aggregate (...) expr: similar to for, but the expression must be of type Iterable<T>? and all resulting elements are aggregated into a single Iterable.

Perhaps I'm being dense, but I don't quite follow how aggregate() can't just be a function.

gavinking commented 12 years ago

> Another possibility just occurred to me: making Sequenced a subtype of Void instead of Object, so that Void would be of Object|Sequenced|Nothing. But I haven't really thought about that yet, so maybe it's nonsense...

FTR, this doesn't sound like nonsense to me, and I'm not viscerally against it. It may be that the additional complexity of this solution is less bad than the wartiness of sequenced parameters. And from a philosophical point of view, it's not really wrong to say that I can have nothing, something, or many. Indeed, in many modeling languages, that's the basic way of expressing relationships.

I think we should explore this possibility further.

gavinking commented 12 years ago

The basic problem with Object|Sequenced|Nothing is that presumably our existing collection types, starting right at the top with Category and Iterable, are instances of Sequenced and not Object. Currently Ceylon doesn't let you define an interface that is not a subtype of Object, so we would need to slightly adjust the type system. I think this can be made to work out.

Furthermore, collections can't be passed to functions not explicitly declared to accept Sequenced. At first blush, it feels like this probably would work out perfectly fine, since T... would I suppose mean T|Sequenced<T>, but it would certainly have far-reaching consequences that we would need to think all the way through.

Perhaps there is a slightly less aggressive solution where you have class Void() of Nothing|Object and class Object() of One|Many {}, that sidesteps these problems. It's not clear to me right now to what extent this would solve the real problems we're interested in, and to what extent it would introduce frustrating limitations.