Closed gavinking closed 12 years ago
I've gotta get to bed unfortunately, so for now I'll just copy what I said to Gavin a little earlier:
There are three fairly orthogonal aspects to this:
I like it. So basically for
can now accept a single expression? And then { for (p in people) p.name }
would be just sugar for p.project((Person p) p.name)
? The second one is a little more complicated, since you'll have to detect the if
to know that it's a filter and transform the whole thing into p.select((Person p) p.age>=18).project((Person p) p.name->p.age)
or something like that.
Just a nitpick about "semantics of comprehensions".
Consider for (x in xs) if (s(x)) p(x)
for example. The syntax is a bit ambiguous. To me, it looks like you can support not only selection/projection but any type of expressions such as:
for (x in xs) if (s(x)) p(x) else q(x)
Although, It would be great to support it, I think a Python-like syntax for this would be a bit more clear for selection/projection:
p(x) for (x in xs) if (s(x))
or
value map = HashMap { p.name->p.age for (p in people) if (p.age>=18) };
@soheilhassasyeganeh Well, we did consider that over in #82, and in certain ways I find it more readable, but the argument goes that throughout the rest of the language you have to declare a variable before you can reference it, and that we should stick with the same convention here.
@gavinking sorry I didn't read #82. The convention makes sense.
I suppose select()
and project()
should be instance methods of Iterable
with default
implementations, allowing customized refinements that avoid creating an Iterator
.
In case you don't already know this, the standard names for those operations (at least that I've always seen) are filter
and map
. Of course, I've mostly seen these in the functional-languages community, so there may be different names in the OO community. Actually, looking online, C# uses Where
and Select
. Oy, naming conventions...
Ross, I'm using the standard terminology for these operations from relational algebra (yes, we'll also need a product/join operation).
I think project
is only used in that context for projecting components of tuples, not for applying arbitrary functions. That is, it's a specialization of map
applied to the project function for tuples.
I think
project
is only used in that context for projecting components of tuples, not for applying arbitrary functions.
That might be strictly true, but in informal industry usage I've always seen it used pretty much synonymously with "stuff you can do in the SELECT
clause" ;-)
FTR, in SmallTalk they're called select:
and collect:
. I'm also quite a fan of those names. Definitely prefer them to map()
/ reduce()
or map()
/ filter()
.
Regarding the question on naming, 50 years of functional programming has etched the terms map
, fold
, filter
into stone such that even languages like JavaScript, Perl or Java are using it. Using anything else is trying to make it harder for people to get into Ceylon for no good reason.
I am not a big fan of the syntax for comprehensions because:
for
constructs for essentially different features, and if we ever allow for
expressions we'll get three and I think the only one people don't really expect is the comprehension oneif
people[].name
, which is a simplistic use-case but quite intuitive syntaxp.filter((Person p) p.age>=18).map((Person p) p.name->p.age)
and at some point we can't introduce syntax to shorten every possible library call.I have the feeling that for comprehensions are just a Perl-ism, trying to make things shorter using ad-hoc syntax, when there's already another way to do it.
+1 on putting operations like filter
and map
(and a shitload of others) on Iterable, like Java 8.
Agree with @FroMage here that map
, fold
and filter
are used in many languages and are the terms that people use when talking about these things, on the other hand almost no Java Enterprise Programmer will know SmallTalk (if they even heard of it).
Also agree with the library calls. Building these on Iterable
so we can make them lazy and chainable will give us all we need. If the syntax is not "short" enough we might need to take a look at the anonymous function expressions instead.
Totally disagree. Look how many times you had to declare what p
is in:
people.filter((Person p) p.age>=18).map((Person p) p.name->p.age)
Compared to:
{ from (p in people) if (p.age>18) p.name->p.age }
Now imagine that the type of p
is something more complex that Person
. Say, Entry<String,List<Integer>>
.
Calling an ancient construct from mathematics a "Perl-ism" is really quite a stretch.
What I called a Perl-ism is "trying to make things shorter using ad-hoc syntax, when there's already another way to do it."
Well the declaration of p
is exactly what could be said is a disadvantage of the current anonymous function syntax. If the type of p
could have been determined from its use than it would be shorter, something like:
people.filter((p) p.age>=18).map((p) p.name->p.age)
It's a pity that it's not possible right now, you're sure that it's just not doable in Ceylon, right?
Well the declaration of
p
is exactly what could be said is a disadvantage of the current anonymous function syntax
This has nothing to do with syntax it has to do with having a well-defined typesystem with rules that are easy to understand.
It's a pity that it's not possible right now, you're sure that it's just not doable in Ceylon, right?
Oh all kinds of things are doable. Indeed you can easily hack in just about any kind of ad hoc type inference rules you like into the compiler. The question is whether it's desirable to do so.
Heh, I should really start typing funky inference examples into C#, Java, and Scala just to see what happens. I wonder if I can still make any of them crash...
OK, support for comprehensions is now implemented in the comprehensions
branch of the typechecker and language module. Check it out - it's actually cool as hell!
So it looks like the idea of just naively trying to map comprehensions onto map()
and filter()
doesn't work out after all. The problem is that products like this don't map naturally to a higher-order function:
for (x in 1..12) for (y in 1..12) x*y
Now, I think with sequenced type parameters, it is possible to write down the signature of product()
but I seriously don't want to go there. So I think a comprehension needs to be desugared to an inner class that satisfies Iterable
. I just need to work out the details of this mapping.
Given a comprehension like this:
for (y in ys) for (x in xs) if (x>y) x*y
I think we'll have to generate Java code like this:
new Iterable<Integer>() {
TupleIterator $i = new TupleIterator(ys, xs);
@Override
java.lang.Object next() {
while ($i.hasNext()) {
Object[] $a = $i.next();
Integer x = $a[0];
Integer y = $a[1];
if (x>y) return x*y;
}
return exhausted;
}
}
where TupleIterator
is a class that knows how to form the product of iterators.
Here's a much more efficient version:
new Iterable<Integer>() {
Iterable<Integer> $ys = ys.getIterator();
Iterable<Integer> $xs = xs.getIterator();
Integer y;
Integer x;
boolean $nexty() {
y = $ys.next();
if (y instanceof Finished) {
return false;
}
return true;
}
boolean $nextx() {
x = $xs.next();
if (x instanceof Finished) {
if ($nexty()) {
$xs = xs.getIterator();
x = $xs.next();
return !(x instanceof Finished);
}
return false;
}
return true;
}
@Override
java.lang.Object next() {
while ($nextx()) {
if (x>y) return x*y;
}
return exhausted;
}
}
Your implementation generalizes into this structure:
new Iterator<T>() {
// Iterators
// Context
...
T|Finished next() {
if ($nextContext()) {
return $evalExpression();
} else {
return exhausted;
}
}
}
Note that I incorporate the if
filter into $nextContext
. Essentially each nested for
and if
corresponds to a nesting within the implementation of $nextContext
. This alteration enables for (x in 1..12) if (x % 2 == 0) for (y in 1..12) x*y
.
Also, the reason that map
and filter
are not enough is that you have nested for
s, which requires monadic structure to resolve, either via bind or flatten
. Nonetheless, the above structure should be more efficient.
Closing, this was implemented in m3.
Overview
This proposal defines uniform semantics for comprehensions and sequenced arguments to sequenced parameters and sequence enumerations. This is based upon the observation that the following syntax (a sequence enumeration)
is at some level just an abbreviation for this (a named argument invocation)
even if that's not precisely how it's defined in the spec and treated in the type checker.
Therefore this proposal allows either of these constructs to specify the "argument" to a sequenced parameter in one of three different ways:
1, 2, 3
,people...
, orfor (p in people) p.name
.The proposal comes in two parts.
Change to the semantics of sequenced parameters
Currently a sequenced parameter
T...
is considered to have the typeT[]
. This is somewhat convenient, but it doesn't work out great for processing large streams of values which we might want to access lazily. Therefore I propose to say that a sequenced parameter has the typeIterable<T>
. To allow convenient access to the sequenced parameter as a sequence, we can just support the spread syntax for sequence enumerations, for example:Of course, we also need to redefine the spread syntax to accept
Iterable<T>
instead ofT[]
. This anyway works out much better for transforming between different collection types, for example:Semantics of comprehensions
We can now define the following syntax:
as an abbreviation for:
or perhaps:
if we define these functions as methods on
Iterable
.Now we can write: