Open lucono opened 7 years ago
I really don't like this, it looks nice, but isn't at all regular. I especially dislike the "magical identifier" "it
".
Additionally, what would the following print?
void foo(Boolean(Boolean)|Boolean() predicate)
{
if(is Boolean(Boolean) predicate)
{
if(predicate(true))
{
print("1");
}
else
{
print("2");
}
}
else
{
if(predicate())
{
print("3");
}
else
{
print("4");
}
}
}
shared void run()
{
Boolean it = false;
foo{: => it};
}
I'm now in favor of adding a very limited form of this feature to 1.4, but by making it
a keyword. Thus, the syntax would be:
ints.each(print(it));
The difference between my proposal, and what is proposed above, is this would be only for single-expression anonymous functions, and not for blocks.
Oh, I see, there's an ambiguity here with respect to expression like foo(bar(baz(it))
. Which of foo()
, bar()
, and baz()
does it
belong to? Still it should be possible to resolve that ambiguity during typechecking.
And I think requiring:
ints.each(=>print(it));
Would undermine most of what's nice about the feature.
I dislike this feature as you proposed, @gavinking, but there is a more disciplined version that I feel fits in more in the language.
Considering #5989, I feel like it
could be the keyword for inferring the type of static references. This way, one would write what you wrote in #7215 as the following:
persons
.filter((person) => 18 <= person.age)
.sortedBy(it.name)
.map(it.email)
.forEach(print);
In addition, if there were an operator for function composition (I'll use "/compose/
", in the lack of a good symbol), one could write the .filter
call as .filter(18.notLongerThan /compose/ it.age)
. It's interesting to note that such operator would be useful in its own right, not only when used together with this feature.
The problem would be when calling functions, as in .forEach(it.setAge(20))
, which would need to be written as .forEach(shuffle(it.setAge)(20))
.
@Zambonifofex I think what you're proposing is almost completely useless, since it doesn't reduce the token count from what I can already write today in Ceylon. And I don't see how, if it
is so objectionable as a way to infer a parameter, it could possibly be unobjectionable as a way to refer to the type of that parameter.
Nor do I see how it's more "disciplined". It's merely less useful.
I have pushed a prototype implementation to the 7190 branch, so folks who are interested can play with it. (There are surely bugs; I've spent about one hour working on this.)
@gavinking
it doesn't reduce the token count from what I can already write today in Ceylon
Types can be longer than a single token.
And if you don't like "it
" as the name, you could use "type
", for example. Either way, I felt the need for a feature like this several times in the past; I just feel like it would be more generally useful than what you're proposing.
But if you really don't like my suggestion, I could get used to your idea if you introduced some indication of which function the "it
" actually belongs to. Since you don't like "=>
", what about ":
"?
persons
.filter(:it.age >= 18)
.sortedBy(:it.name)
.map(:it.email)
.forEach(:print(it));
And I also don't like the fact "it
" isn't a keyword in your implementation. As I've said, I dislike "magical identifiers" like these; there is no indication that they are declared, which might hurt readability.
Replying to @gavinking’s comment on another issue:
persons .filter(it.age >= 18) .sortedBy(it.name) .map(it.email) .forEach(print(it))
Is far more readable than:
persons .filter((p) => p.age >= 18) .sortedBy((p) => p.name) .map((p) => p.email) .forEach(print)
Perhaps, but I think this is the most readable incarnation of the same code:
persons
.filter((person) => person.age >= 18)
.sortedBy(Person.name)
.map(Person.email)
.forEach(print)
In each line that does something person
-related, you’re reminded that you’re operating on person
s, not p
s or it
s. This may be slightly more work to type, but code is more often read than written and so on.
Which of
foo()
,bar()
, andbaz()
does it belong to? Still it should be possible to resolve that ambiguity during typechecking.
Yes, but we all know the typechecker is smarter than the programmer. Don’t force me to repeat the typechecker’s reasoning in my head to make sense of this code.
In the end, I personally don’t believe a further abbreviation syntax for this is necessary. We already have static references for th simplest cases (map(Person.email)
), and if your lambda is more complicated, I personally don’t think a short (it) =>
is too much to ask.
@lucaswerkmeister
Perhaps, but I think this is the most readable incarnation of the same code: ```ceylon persons .filter((person) => person.age >= 18) .sortedBy(Person.name) .map(Person.email) .forEach(print) ```
I agree that Person.name
, Person.email
is quite good. And it works as long as all I'm doing is using attributes. But the approach quickly breaks down as soon as I need to call a method of Person
, and so in practice I've found it sometimes useful, but more often than not I do have to use the longer form with the (p) =>
This may be slightly more work to type, but code is more often read than written and so on.
I'm not motivated by a desire to avoid typing. I'm motivated by how nice I think the code for stream processing looks in some other languages that have this feature.
if your lambda is more complicated, I personally don’t think a short
(it) =>
is too much to ask.
It doesn't have to be very complicated at all: it.longerThan(10)
, it.sequence()
, it==' '
.
I dislike "magical identifiers" like these; there is no indication that they are declared, which might hurt readability.
Well the IDE would of course give a visual indication that it
is special. But sure, on reflection, I agree. For consistency with this
, super
, outer
, which also don't really need to be reserved words, it
should probably be a keyword too. (Unless we made none of them reserved words.)
I've pretty much finished the implementation of this. And after trying it out in my example apps, I'm sorry to conclude that I pretty much love it
:
value evenCubes =
LongStream
.iterate(1, it+1)
.parallel()
.map(it^3)
.filter(2.divides)
.mapToObj(it)
.limit(20)
.collect(toList<Integer>());
value map =
Stream
.with(*text.split(it==','))
.parallel()
.map(it.trimmed)
.filter(it.shorterThan(10))
.collect(toMap(String.string, String.size));
Stream
.concat(Stream.with(*text.split()),
Stream.iterate(1, 2.times))
.parallel()
.filter(switch (it)
case (is String) it.longerThan(4)
case (is Integer) it>4)
.limit(10)
.forEachOrdered(print);
Yes, I agree that it
sits right on the border the class of "magical implicit stuff" that makes me rather suspicious, and that I've worked very, very hard to keep out of the language.
However:
it
keyword is absolutely explicit in it
s intent. foo(bar(baz(it)))
, where the "scope" of it
can only be resolved after first assigning a type to foo
, bar
, and baz
. But that's also the case with regular identifier resolution.it
pretty abuse-proof.So, we need to decide whether we should go for it
and merge this work. Can we live withoutit
? Definitely. We've been living without it
for years now. Does it
make the language nicer? It seems to me that it
does. I like the look of the code with it
. In particular, it
means that Ceylon doesn't lack something that people like in other competing languages.
@gavinking
You've never said what you think about using ":
" for indicating which function call it
belongs to.
You've never said what you think about using "
:
" for indicating which function callit
belongs to.
Well, I think it's ugly. The only motivation for adding this feature is that it makes the code look cleaner and more visually pleasing. Plus, the choice of :
would be totally adhoc here. There is no other context in the language in which :
means "function".
@Zambonifofex To be clear, we could use =>
instead, which would make much more sense, given that =>
already means "function" in the language:
value evenCubes =
LongStream
.iterate(1, => it+1)
.parallel()
.map(=> it^3)
.filter(2.divides)
.mapToObj(=> it)
.limit(20)
.collect(toList<Integer>());
But, again, all of the value of this feature is in terms of reducing visual clutter. And =>
is still clutter.
I was going to agree, until I saw this abomination:
It seems to me that
it
does.
:)
More seriously, I think this is overall a good feature. There are downsides:
Person.name
, (person)=>person.name
, and it.name
). (The second is ugly, but the first and third may both be preferable depending on context, even in the same file.)However, it's so concise and convenient, and as often as I have fun((val) => expressionInvolvingVal)
in my code, I think it would be worth it. I'd probably say the same even if the only benefit were to save me from having to type (val) =>
which requires way too much use of the shift key.
@jvasileff I think it's actually pretty nice that it.name
boils down to the same thing as Person.name
in the case where both are allowed. By which I mean, that this proposal is actually a strict superset of what @Zambonifofex proposed above.
My point is, I'd normally prefer:
persons
.sortedBy(Person.name)
.map(Person.email)
.forEach(print)
but then, if we add the filter, for consistency I might prefer:
persons
.filter(it.age >= 18)
.sortedBy(it.name)
.map(it.email)
.forEach(print)
It's another decision (another stylistic choice). But overall, less important to me than the added convenience and reduced clutter.
Alright, so the remaining wrinkle I have here is the following code:
print(["hello", "world"].each(print(it.filter(it!='l'))));
Currently this is accepted and does what is the right thing in this case. However, I'm uncomfortable with having it
refer to two different values in the same expression. So I would prefer to outlaw that. What I'm slightly unsure of is what, precisely is the right way to express that restriction?
it
s bind to the outermost function that accepts a lambda, in which case the above code would be rejected because it!='l'
is not a well-typed argument to filter()
?it
refer to the same value, in which case the above code would be rejected because it
refers to two different things?P.S. I definitely don't propose to iintroduce:
outer.it
Or:
(it of filter)
:-P
but by making
it
a keyword ...this would be only for single-expression anonymous functions, and not for blocks
I think this strikes a good balance for this feature to make a meaningful difference without introducing a lot of messy details.
I'm uncomfortable with having it refer to two different values in the same expression
How about:
it
is always bound to the most immediate/local lambda scope in which the it
is used.it
in more than one of multiple overlapping lambda scopes in an expression is disallowed. Meaning, in a nesting of overlapping lambda scopes, it
can only be used within any one of the overlapping scopes, but is a compile error as soon as it's used in a second (or more) of the overlapping scopes.While the above rules are probably on the more restrictive side, they would allow a lot of the simple, single expression use cases:
persons
.filter(it.age >= 18)
.sortedBy(it.name)
.map(it.email)
.forEach(print)
without creating much or any conflict for a potential future lifting of some of those restrictions, or expansion of the use of it
.
So, for the original expression, which would be illegal under the above rules:
print(["hello", "world"].each(print(it.filter(it!='l'))));
It would have to be re-written to use it
only in the outer each
scope:
print(["hello", "world"].each(print(it.filter((str)=>str!='l'))));
Or use it
only in the inner filter
scope:
print(["hello", "world"].each(print((str)=>str.filter(it!='l'))));
But not both.
@lucono yes, sorry, I misspoke when I said:
I'm uncomfortable with having it refer to two different values in the same expression.
Of course I'm fine with it referring to different values within nonoverlapping subexpressions. (Otherwise the feature would be pretty much useless.)
I'm turning a bit cold on this based on readability issues where it
occurs more deeply inside an auto-lambda expression due to a) lack of clarity for what it
refers to, and b) lack of obviousness that the outermost arg is auto lambdaized (lambdized?).
[1,2,3].each(print([3,4,5].filter(it.equals)))
You must 1) scan far to the right to understand what's on the left, and vice-versa, and 2) more carefully consider types, which can already be confusing at times. Disallowing all but the most shallow uses of it
would add to the awkwardness of this feature.
(My assumption is that once it
is discovered, auto-lambda-ization occurs for the outermost possible argument.)
Another example (not the best, but hopefully communicates the idea) reads ok:
persons.map((person) =>
Div {
person.children
.filter(it.age >= 18)
.map(Div(it.name))
}
)
and runs with the current code, but it's confusing to me why it works. I could have meant something like:
persons.map((it) => ((person) =>
Div {
person.children
.filter(it.age >= 18)
.map(Div(it.name))
}
))
more generally, if it
(at any depth) affects the outermost argument to a Callable
parameter (which seems reasonable), then this feature becomes unusable in complex builder expressions, which I think are an important part of Ceylon.
Finally, if it
is allowed as a regular identifier, there is an ambiguity:
variable value it = 1;
function adder(Integer x)(Integer y) => x+y;
print([1].map(adder(it))); // may print either "{ 2 }" or "{ Integer(Integer) }"
I'm turning a bit cold on this based on readability issues where
it
occurs more deeply inside an auto-lambda expression due to a) lack of clarity for whatit
refers to, and b) lack of obviousness that the outermost arg is auto lambdaized (lambdized?).
Yes, I agree. That's why the example you've just given is one of the things I'm proposing to disallow.
Finally, if
it
is allowed as a regular identifier, there is an ambiguity:
We might need to warn against use of it
as a regular identifier. However I don't want to make it an error b/c of all the existing code that would break.
Ok, perhaps I misunderstood your proposed restriction. But having it
become unusable within arguments to invocations that occur within the auto-lambda would be strange, I think:
Disallowing all but the most shallow uses of it would add to the awkwardness of this feature.
@jvasileff No, neither of the proposed fixes would do that:
- Have all
it
s bind to the outermost function that accepts a lambda, in which case the above code would be rejected becauseit!='l'
is not a well-typed argument tofilter()
?- Initially accept nested implicit lambdas, but require that every
it
refer to the same value, in which case the above code would be rejected becauseit
refers to two different things?
To be clear, your example would be disallowed by the first option, allowed by the second option.
Note that we know (I assume) that it
would be for the outermost argument to a parameter of type Callable
, but the reader still has to find where that is.
My argument is that it would be strange to disallow my example (who care's how/where we use it
in the expression), but it also creates readability challenges.
Have all its bind to the outermost function that accepts a lambda, in which case the above code would be rejected because it!='l' is not a well-typed argument to filter()?
I don't see how that would disallow:
[1,2,3].each(print([3,4,5].filter(it.equals)))
which is well typed? As in:
[1,2,3].each((Integer it) => print([3,4,5].filter(it.equals)))
FWIW, I'm not saying every will agree that this sort of thing is hard to read, just that it seems that way to me.
I don't see how that would disallow:
Oh, OK, sorry, I did not read carefully enough. No, fine, I agree that your example would be accepted.
Finally, if
it
is allowed as a regular identifier, there is an ambiguity:We might need to warn against use of
it
as a regular identifier. However I don't want to make it an error b/c of all the existing code that would break.
If we are to have this feature, I’d strongly advocate for making it
a keyword. It removes ambiguity and makes the code slightly more readable in an environment with syntax highlighting (having it
bold makes it clear that it’s not a regular identifier). (And IDEs are not the only syntax highlighters, so bolding it
based on the typechecker’s analysis is not a full replacement for a keyword IMHO.)
On the other hand, I have to admit that at least in the SDK, there seems more code using it
as identifier than I expected (as abbreviation for item
or iterator
). Perhaps we can deprecate it in 1.4 (2.0?) with the firm announcement that we will make it a keyword in 1.5 (2.1)?
@lucaswerkmeister in the current implementation it is a keyword. However it's a keyword that is allowed in identifier locations for the purpose of backward compatibility.
Ah, that’s an interesting variation. I’m fine with that, then.
I've never been a fan of things that don't express the laziness of computation visually, including things that look like expressions but are instead lazily evaluated, or in this case, wrapped into a lambda for later evaluation or no evaluation at all.
Now, in the name of following every other language in the race towards shorter more confusing programs, I guess I could say meh, why not follow others, but are we really talking about many languages here?
Is this something so important to add to justify adding yet more "there's more than one way to do it" to Ceylon?
Also, I've never understood the meaning of it
, and assumed it was short for iterated
which never made sense to me. I think the name is ill-chosen, but on the other hand if that's what everybody else is using it's going to be hard to suggest anything else, even though IMO value
springs to mind.
Alright, so the remaining wrinkle I have here is the following code:
print(["hello", "world"].each(print(it.filter(it!='l'))));
Groovy has the same "problem". I think in some pathological cases, it can get kind of difficult to track (pun intended). But usually I have solved this by introducing a new name for the outer scope:
print(["hello", "world"].each(print((word) => word.filter(it!='l'))));
I've never been a fan of things that don't express the laziness of computation visually
To throw something out there, which I may actually wind up hating...
{ ... }
indicates laziness in Ceylonit
and a parameter's typeSo, an option would be to autoclosurize expressions used to define single-element streams when assigned to single argument Callable
s using a named args invocation.
persons
.filter { it.age >= 18 }
.sortedBy { it.name }
.map { it.email }
.forEach(print);
Of course, I'm just spitballing here. I haven't thought through how this would look for more complex examples.
p.s. I suppose persons.filter { true /* it.age >= 18 */ }
could also be allowed, which would help when writing and troubleshooting code.
Groovy has the same "problem". I think in some pathological cases, it can get kind of difficult to track (pun intended). But usually I have solved this by introducing a new name for the outer scope:
print(["hello", "world"].each(print((word) => word.filter(it!='l'))));
This is similar to what I'm proposing here, but where you could instead choose to use it
in any one of the overlapping lambdas in the expression and be forced by the compiler to use an explicit name ( (word) =>
) in all the other overlapping lambdas in that expression.
This way, there could be no confusion as to which of the overlapping lambdas in the expression the it
belongs to.
where you could instead choose to use
it
in any one of the overlapping lambdas in the expression and be forced by the compiler to use an explicit name ((word) =>
) in all the other overlapping lambdas in that expression.
To be 100% clear, I agree that this is what we should do.
@jvasileff
So, an option would be to autoclosurize expressions used to define single-element streams when assigned to single argument
Callable
s using a named args invocation.
I don't hate it, honestly it looks quite reasonable. Let me think it over for a bit.
The only initial reactions I have are:
it
?", since I can still reasonably have nested {}
within a lambda, however, I agree that in practice it makes it much less likely that you'll be confused.it
and explicit forms (IDE can help).{
make it look like I could write a block of code in there, but I don't think we can possibly make that work, because it's syntactically ambiguous.So let me just throw out a different approach to reducing the clutter of short anonymous functions, one that takes advantage of something a bit special about how Ceylon declares functional parameters. Unlike other languages, we usually have not just a type, but also a name for each of the parameters of a functional parameter. So why do we force you to redeclare the name? We already let you leave the type off.
What if we let you write this:
persons
.filter(=> element.age >= 18)
.sort(=> x.lastName <=> y.lastName)
.map(=> element.name -> element.email)
.each(print)
Now, this doesn't read as nicely as it
, but that's mainly because we didn't give the parameters very good names in the Iterable
class, and we can fix that, leaving:
persons
.filter(=> it.age >= 18)
.sort(=> x.lastName <=> y.lastName)
.map(=> it.name -> it.email)
.each(print)
Or whatever.
Now, a problem that immediately arises with this is:
Where we don't have useful parameter names. But we could at least solve the unary case by just defaulting the parameter name to it
.
Anyway, it's just an idea. I'm not sure I really like the look of it. But it's at least very intellectually consistent with the rest of the language.
Hm, interesting… not sure about the look either, but I like the idea behind it very much. It gives a use to those parameter names besides documentation, it’s consistent, it lets you provide better names than it
(I think I actually prefer element
over it
:) ), and it supports functions with multiple parameters (great).
Frankly, this is much more adequate in my opinion. I also prefer element
over it
.
Now, the big downside would be that it's not the same as <insert other language name here>
. Another difference we have we'd have to justify. BUT this actually works with more than one param, no? If other languages don't support that, it could be a worthy justification for the difference.
P.S. The =>
is absolutely needed in this proposal, because we don't have the it
keyword as a way of recognizing the programmer really wanted a thunk. Earlier I argued against the clutter of =>
, but with the it
keyword it was doing very little real work, whereas here it is doing actual work. And of course it's a solution to the problem of not being able to immediately recognize the scope of the implicit thunk.
BUT this actually works with more than one param, no?
Correct. That's the point.
It gives a use to those parameter names besides documentation
Right. Well, they're not just used for documentation; they're also used by the IDE to generate nice anonymous function parameter names.
it lets you provide better names than
it
(I think I actually preferelement
overit
:)
Funny. I've always thought the it
thing was super cutesy.
I'm going to be the contrarian here, take this as you will:
I think friendly identifier names are unimportant for this feature. The identifiers will normally be used immediately, as they are in the majority of the examples, and with obvious meaning.
And, in fact, "standard" names like it
or $0
(I know there are problems with the latter) actually highlight that the identifier is implicit, so you don't have to read through a bunch of code to figure out which identifiers in an expression (or whatever) are the implicit ones.
So why do we force you to redeclare the name?
The names are not part of the type system, and are often different in method refinements.
Further, names chosen for documentation purposes which make sense for a particular function or method may be very confusing from the perspective of the caller's code.
@jvasileff Well I agree with all that. All I would say about it is that it's always up to you to write code with understandable labels. If element
makes the code less understandable, then add (person)
or whatever, to make it clearer. That will always be easy.
Anyway, it's just an idea. I'm not sure I really like the look of it.
Uff, I'm playing with this now, and it turns out I think I really like it. I think I like it much more than I predicted I would.
I liked this so much, I've pushed my implementation to master.
I'm now unreasonably fond of this feature, and frustrated I can't use it with block functions in this spark example:
get("/", => renderTodos(request));
post("/todos",
(req, res) {
todoDao.add(Todo(req.queryParams("todo-title")));
return renderTodos(req);
});
Well, I guess you can write:
get("/", => renderTodos(request));
post("/todos",
function {
todoDao.add(Todo(request.queryParams("todo-title")));
return renderTodos(request);
});
A very common scenario in many projects and libraries is one involving functions which accept a single argument, where the single argument is a zero- or single-argument function or lambda. In these cases, it would be useful to have a syntax which allows the body of the lambda directly in the function call's braces without needing to wrap the lambda with additional curly braces, improving readability quite a bit.
The syntax would need to be able to disambiguate this usage from other possible usages of the named argument function invocation syntax.
To invent a fictional implementation as an example, one might be similar to the invoke operator but where the opening curly brace of the named argument invocation syntax is immediately followed by a colon, resulting in a new function invocation operator which opens with
{:
as an indicator that the invocation braces directly contain the body (i.e. not wrapped in additional curly braces) of the single lambda argument being supplied.Basically:
it
) only available inside the lambda could be supported for accessing its single argument, and the need for an explicit parameter list relaxed. (Nested lambdas that would create nestedit
s or shadowedit
s could be disallowed).;
at the end of the last statement or expression within the lambda block, as the ambiguity with other possible usages of the name argument syntax would not exist when using the new invoke operator. This will particularly eliminate the noise of the terminating semi-colon;
for lambdas with a single statement or expression.{
and the colon:
, surrounded by parentheses.