ceylon / ceylon-compiler

DEPRECATED
GNU General Public License v2.0

special handling for Java Iterable #403

Open gavinking opened 12 years ago

gavinking commented 12 years ago

I think we're going to need to do the following:

  1. automagically transform any occurrence of java.lang.Iterable to Ceylon Iterable in the model loader, and
  2. generate code to transform between these two types at runtime.

Otherwise you just won't be able to conveniently iterate over Java collections using for.

Thoughts?

quintesse commented 12 years ago

Why not just make for accept both types of Iterator? I mean, is there a reason it would have to be a Java iterator wrapped in a Ceylon iterator for example?

chochos commented 12 years ago

The Iterator behavior is quite different. Ceylon iterators only have a next method and return exhausted when they're done (and keep returning that value indefinitely if you keep calling next). Java iterators have a hasNext and throw an exception if you call next after they have returned the last element.

I think it would be easier to have a special java.util.Iterator that wraps a Ceylon Iterator and a special Ceylon Iterator that wraps a java.util.Iterator. The only problem would be when passing an iterator back and forth between Ceylon and Java code (you could end up with an onion-like iterator) but maybe that can be solved at compile-time by checking if the wrapping really needs to be done (wrap/unwrap as needed).
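The two wrappers described above could look roughly like the following Java sketch. `CeylonStyleIterator` and its `EXHAUSTED` sentinel are hypothetical stand-ins for `ceylon.language.Iterator` and `exhausted`, not the real runtime types, and the class names are invented for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for ceylon.language.Iterator: next() returns either an
// element or the EXHAUSTED sentinel, and keeps returning it once finished.
interface CeylonStyleIterator<T> {
    Object EXHAUSTED = new Object();

    Object next();
}

// Ceylon-style view of a java.util.Iterator: hasNext() is folded into next().
final class FromJavaIterator<T> implements CeylonStyleIterator<T> {
    private final Iterator<T> delegate;

    FromJavaIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Object next() {
        return delegate.hasNext() ? delegate.next() : EXHAUSTED;
    }
}

// java.util.Iterator view of a Ceylon-style iterator: hasNext() has to read one
// element ahead and cache it, because the only way to test for the end is next().
final class ToJavaIterator<T> implements Iterator<T> {
    private final CeylonStyleIterator<T> delegate;
    private Object cached;
    private boolean hasCached;

    ToJavaIterator(CeylonStyleIterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        if (!hasCached) {
            cached = delegate.next();
            hasCached = true;
        }
        return cached != CeylonStyleIterator.EXHAUSTED;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasCached = false;
        Object result = cached;
        cached = null;
        return (T) result;
    }
}
```

Nesting one wrapper inside the other, as in `new ToJavaIterator<>(new FromJavaIterator<>(list.iterator()))`, is the onion effect mentioned above; this sketch makes no attempt to unwrap.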

quintesse commented 12 years ago

No, what I mean is, the code that generates the for-loop could just handle both iterators because it doesn't need to do anything special, just two slightly different ways of testing the end. But if we want to treat iterators always as interchangeable, well that's different of course.
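As a rough illustration of the "two slightly different ways of testing the end", the two loop shapes the backend would emit might look like this in Java. `CeylonStyleIterator`, `EXHAUSTED` and `ForLoopShapes` are hypothetical names used only for this sketch; the real generated code is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Ceylon iterator: next() returns an element or
// the EXHAUSTED sentinel.
interface CeylonStyleIterator {
    Object EXHAUSTED = new Object();

    Object next();
}

final class ForLoopShapes {
    // Shape for a Java iterable: test hasNext() before fetching each element.
    static List<Object> collectJava(Iterable<?> iterable) {
        List<Object> out = new ArrayList<>();
        Iterator<?> it = iterable.iterator();
        while (it.hasNext()) {
            Object elem = it.next();
            out.add(elem); // stands in for the loop body
        }
        return out;
    }

    // Shape for a Ceylon iterable: fetch first, then compare with the sentinel.
    static List<Object> collectCeylon(CeylonStyleIterator it) {
        List<Object> out = new ArrayList<>();
        Object elem;
        while ((elem = it.next()) != CeylonStyleIterator.EXHAUSTED) {
            out.add(elem); // stands in for the loop body
        }
        return out;
    }
}
```

Choosing between the two shapes relies on the static type of the iterated expression being known at compile time.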

chochos commented 12 years ago

That only works for for loops in Ceylon, but it doesn't do anything for Java code that receives a Ceylon iterator when it's expecting a Java one. On the Ceylon side, code expecting a Ceylon iterator will break when it gets a Java iterator at runtime (since it never gets an exhausted but rather an exception is thrown).

FroMage commented 12 years ago

Well… It's not that simple to automagically pretend j.l.Iterable is c.l.Iterable. So for example j.u.List<T> will satisfy c.l.Iterable, which will hide the real implementation of j.l.Iterable, so suddenly there will be a lot of boxing, and special cases for is, plus more boxing to get the iterator.

Because we'll also have to do magic for j.u.Iterator and c.l.Iterator.

Not sure I like this. I'd rather have for support both by generating different code, and provide conversion functions.

gavinking commented 12 years ago

Why not just make for accept both types of Iterator?

Because then the definition of for in the language specification would depend upon a type which is not defined in ceylon.language.

quintesse commented 12 years ago

I don't see how this would be different than silently wrapping a Java iterator in a Ceylon iterator.

gavinking commented 12 years ago

I don't see how this would be different than silently wrapping a Java iterator in a Ceylon iterator.

Very different. Two totally different levels of abstraction. One is part of the language definition. One is part of the interop layer.

quintesse commented 12 years ago

No, I mean: what's the difference? Either I silently generate a wrapper or I make the Java backend generate special code for Java iterators. In either case the result is the same and I don't see why one case has to be mentioned in the spec and the other doesn't.

gavinking commented 12 years ago

Because the typechecker has to know that

for (x in javaIterable) { ... }

is well-typed.

quintesse commented 12 years ago

Ok, I still think we could do model loader tricks, the same as we "pretend" that a Java String really is a Ceylon String, but I give up, wrappers it is.

FroMage commented 12 years ago

Does it make any difference that String is final and j.u.Iterator is an interface? If we do model loader tricks I mean.

RossTate commented 12 years ago

I vote for c.l.Iterable to extend j.l.Iterable and provide an unoverrideable implementation of its iterator() method that simply wraps the result of the Ceylon iterator attribute. Then make it so that our for can accept a j.l.Iterable but optimize it when we know it's a c.l.Iterable. (Note that I'm not saying c.l.Iterator should extend j.l.Iterator, nor am I saying either should be implicitly coerced/wrapped into the other in general, only in the implementation of iterator().)

Now, I'm guessing Gavin's gonna retort with the fact that this means Ceylon's for has to have knowledge of Java's j.l.Iterable. That's true if we're writing Ceylon code that interops with Java. However, for any Ceylon code that is isolated, the for only has to know about our c.l.Iterable. You can think of this as a language extension that exists only when compiling to Java. If we're compiling to JavaScript, then there will be no mention of j.l.Iterable in our class hierarchy or our type system.

Note that we'll probably run into similar issues when interoping with other languages. We'll probably need to make similar language extensions that are transparent for non-interoping code.
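A minimal Java sketch of this proposal, under the assumption that the Java-backend form of c.l.Iterable is modelled as an abstract class so that `iterator()` can be declared `final`. `CeylonStyleIterator`, `CeylonStyleIterable`, `ceylonIterator()` and `Countdown` are hypothetical names, not part of the real language module.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for ceylon.language.Iterator (not the real runtime type).
interface CeylonStyleIterator {
    Object EXHAUSTED = new Object();

    Object next();
}

// Hypothetical Java-backend shape of c.l.Iterable under this proposal.
abstract class CeylonStyleIterable<T> implements Iterable<T> {
    // The only member a Ceylon class would implement itself.
    abstract CeylonStyleIterator ceylonIterator();

    // Final, so the bridge to Java's protocol cannot be overridden.
    @Override
    public final Iterator<T> iterator() {
        final CeylonStyleIterator source = ceylonIterator();
        return new Iterator<T>() {
            // Reads one element ahead so that hasNext() can be answered.
            private Object lookahead = source.next();

            @Override
            public boolean hasNext() {
                return lookahead != CeylonStyleIterator.EXHAUSTED;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T result = (T) lookahead;
                lookahead = source.next();
                return result;
            }
        };
    }
}

// Example implementation: counts down from a start value to 1.
final class Countdown extends CeylonStyleIterable<Integer> {
    private final int from;

    Countdown(int from) {
        this.from = from;
    }

    @Override
    CeylonStyleIterator ceylonIterator() {
        return new CeylonStyleIterator() {
            private int current = from;

            @Override
            public Object next() {
                return current > 0 ? Integer.valueOf(current--) : EXHAUSTED;
            }
        };
    }
}
```

With this shape, Java code can write `for (int n : new Countdown(3))`, while Ceylon-side code that knows the static type can call `ceylonIterator()` directly and skip the wrapper.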

FroMage commented 12 years ago

Guys, I got to the point where for supports Java Iterable, so that's one thing easy to do.

Now, erasing both Java's Iterable and Iterator to Ceylon's is a much harder problem. I just thought of the following example:


void iterableCasting(){
    JList<Integer> l = JArrayList<Integer>();
    Iterable<Integer> iterable = l;
    if(is JList<Integer> l2 = iterable){
        print((l2 === l).string);
    }
}

I would expect l, l2 and iterable to hold the same reference myself, but with boxing they won't. Now for some reason the type checker is telling me that java.util.List is not an IdentifiableObject, but that's a bit weird, right? That must be a model loader bug, no?
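The identity concern can be reproduced in plain Java. The sketch below assumes a hypothetical `IterableWrapper`, standing in for whatever box an implicit conversion would allocate; it is not the compiler's actual boxing class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical box of the kind an implicit Java-to-Ceylon conversion would allocate.
final class IterableWrapper<T> implements Iterable<T> {
    private final Iterable<T> wrapped;

    IterableWrapper(Iterable<T> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public Iterator<T> iterator() {
        return wrapped.iterator();
    }
}

final class IdentityDemo {
    // Corresponds to the === test in the Ceylon example.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // Corresponds to narrowing the iterable back to a list with `is`.
    static boolean stillAList(Object value) {
        return value instanceof List;
    }

    // Assigning without a box keeps the reference; boxing creates a new object.
    static <T> Iterable<T> assignDirectly(List<T> list) {
        return list;
    }

    static <T> Iterable<T> assignBoxed(List<T> list) {
        return new IterableWrapper<>(list);
    }
}
```

For a list `l`, `sameReference(assignDirectly(l), l)` holds, whereas `assignBoxed(l)` is a different object that is no longer an instance of `List`, so both the identity check and the narrowing fail.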

FroMage commented 12 years ago

Mmm, it looks like interfaces extend Object and not IdentifiableObject, which means you can't do === with interface types. Sounds a bit extreme, no?

Still I can rewrite the example above as:

void iterableCasting(){
    JArrayList<Integer> l = JArrayList<Integer>();
    Iterable<Integer> iterable = l;
    if(is JArrayList<Integer> l2 = iterable){
        print((l2 === l).string);
    }
}

RossTate commented 12 years ago

There are many issues with implicit wrappings. For example, try to get the following to work:

Sequence<T> singleton<T>(T t) { return {t}; }
JIterable<Integer> itr = ...;
return sameSums(singleton(itr)) && JsameSums(singleton(itr));

where Boolean sameSums(CIterable<CIterable<Integer>> intBags) is in Ceylon and boolean JsameSums(JIterable<? extends JIterable<? extends Integer>> intBags) is imported from Java.

This is tricky because you have to look at how the returned value of singleton(itr) is being used to figure out whether or not you should wrap itr into a CIterable before passing it to singleton.

Implicit coercion is one of those features which sounds simple but does not mix well with other features.

gavinking commented 12 years ago

Irrelevant because this would not be an implicit conversion or wrapping.

RossTate commented 12 years ago

Huh?

I can make it work without any fancy features just by inserting CIterable<T> wrap<T>(JIterable<T>):

Sequence<T> singleton<T>(T t) { return {t}; }
JIterable<Integer> itr = ...;
return sameSums(singleton(wrap(itr))) && JsameSums(singleton(itr));

so I don't know what you mean.

gavinking commented 12 years ago

Ok, I still think we could do model loader tricks, the same as we "pretend" that a Java String really is a Ceylon String, but I give up

Huh? Isn't that exactly what I'm proposing?

gavinking commented 12 years ago

I vote for c.l.Iterable to extend j.l.Iterable

I think this is just an awful idea.

Now, I'm guessing Gavin's gonna retort with the fact that this means Ceylon's for has to have knowledge of Java's j.l.Iterable.

No, I'm going to retort that this means that ceylon.language has a dependency on the monolithic language module of a completely different language that is not even available on all platforms. So on some platforms c.l.Iterator has additional operations it does not have on others. From a philosophical and practical point of view that is just crazy!

FroMage commented 12 years ago

@gavinking what about the problem I raised in https://github.com/ceylon/ceylon-compiler/issues/403#issuecomment-4261681 and the next comment? The thing is that autoboxing like we do for non-IdentifiableObjects is fine (String, Number…) but in the case of Iterator and Iterable it breaks identity.

RossTate commented 12 years ago

Okay, so say you're writing a library, and you've decided you want to write it in Ceylon cuz it's this new cool language. However, you want your library to be accessible, and Ceylon's not that big yet, so you want it to also be able to interact with Java code. As such, you want your classes to implement appropriate Java interfaces only when interoping with Java. You may even realize your library would be useful for JavaScript, so you want your classes to implement appropriate JavaScript methods only when interoping with JavaScript. Otherwise, when you're interoping with just plain Ceylon code, then you don't care if those interfaces or methods are present (in fact you don't want them to be present).

This seems like something that may be common in Ceylon's early stages. In fact, it's already happening. We have our own collection library, and we want Java code to be able to use our collections as they would standard Java collections. However, as you say, we don't want to have a dependency on Java when we're not trying to interop with Java. So, I'm trying to propose a way to have Java compatibility without Java dependency. I'm hoping this solution would not only solve our problems, but future Ceylon developers' problems as well, while avoiding the hackiness of implicit coercions.

gavinking commented 12 years ago

@RossTate To me that is really well beyond the goals we have for Java interop. It is explicitly a non-goal to be able to write libraries in Ceylon for other-JVM-language audiences. If that were a goal, it would call for a completely different design of the language module.

gavinking commented 12 years ago

@FroMage a worse problem is:

Iterable<Object> i = ArrayList<Object>();
if (is ArrayList<Object> i) { ... }

So wrapping is truly problematic. But I think that suggests a solution.

What we could do is have the Java impl of c.l.Iterable extend j.l.Iterable and get the compiler to automatically fill in the implementation of iterator() on classes that satisfy it. Then we lower the type c.l.Iterable to j.l.Iterable everywhere in all client code. WDYT?

Note that this solution still calls for the model-loader magic on Java iterables, and also requires special handling of the iterator attribute of c.l.Iterable.

FroMage commented 12 years ago

I thought about it a bit and ended up with this pseudo code:

interface CeylonIterable<T> extends java.lang.Iterable<T>{
    boolean getEmpty();
    CeylonIterator<T> getIterator();
}

interface CeylonIterator<T> extends java.util.Iterator<T>{
    Object getNext();
}

class CeylonIterableImpl implements CeylonIterable<String>{

    @Override
    public java.util.Iterator<String> iterator() {
        return getIterator();
    }

    @Override
    public boolean getEmpty() {
        return false;
    }

    @Override
    public CeylonIterator<String> getIterator() {
        return null;
    }

}

class CeylonIteratorImpl implements CeylonIterator<String>{

    Object cachedNext;
    boolean isNextCached = false;

    @Override
    public boolean hasNext() {
        // get the next one to see what happens
        if(!isNextCached){
            cachedNext = $userNext();
            isNextCached = true;
        }
        // there is a next element unless we have just read the exhausted marker
        return cachedNext != ceylon.language.exhausted.getExhausted();
    }

    @Override
    public String next() {
        Object next = getNext();
        if(next == ceylon.language.exhausted.getExhausted())
            throw new NoSuchElementException();
        return (String)next;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    @Override
    public Object getNext() {
        Object next;
        // make sure we use the cache if it's there
        if(isNextCached){
            next = cachedNext;
            isNextCached = false;
            cachedNext = null;
        }else
            next = $userNext();
        return next;
    }

    private Object $userNext() {
        return null;
    }

}

You can see that:

  1. If we want to make j.l.Iterable extend c.l.Iterable we are going to have to figure out how to provide an empty attribute.
  2. Our implementations of c.l.Iterator will have annoying stuff to deal with Java's hasNext(), such as moving the user code for getNext() into $userNext().
  3. We effectively have to reserve the hasNext, next, remove and iterator method names like we do for Java's hashCode and toString.
  4. We don't have anything like Java's Iterator.remove, which is another problem, but one that we should fix, because it's very useful in Java.

I have a feeling there are other issues, but I can't quite put my finger on them yet, I'll try again tomorrow.

RossTate commented 12 years ago

I wouldn't have Ceylon's iterator extend Java's iterator. It's not necessary for the for syntax in either language, and it adds memory overhead that's completely useless when using Ceylon iterators as Ceylon iterators (which ideally is the common case).

gavinking commented 12 years ago

Agreed, I don't think we need Ceylon iterators to be Java iterators.

FroMage commented 12 years ago

How can the Java impl of c.l.Iterable extend j.l.Iterable without doing the same for Iterator?

FroMage commented 12 years ago

Right, so one other issue is this:

Container i = ArrayList<Object>();
if (is ArrayList<Object> i) { ... }

So it looks like we have to do some more magic on Container, or drop it from Iterable.

FroMage commented 12 years ago

I think we can make this work if we:

  1. Remove Container from c.l.Iterable
  2. Remove empty from c.l.Iterable
  3. Lower every mention of c.l.Iterable into j.l.Iterable except for satisfies clauses
  4. Make the Java impl of c.l.Iterable implement j.l.Iterable
  5. Add a generated j.l.Iterable.iterator() method for each class implementation of c.l.Iterable.iterator, which will call a utility method which wraps a c.l.Iterator as a j.u.Iterator
  6. Add a runtime type test for each call to c.l.Iterable.iterator since the underlying object may be of type j.l.Iterable and not have this property
  7. Add a special-case for is c.l.Iterable<T>
  8. Make the model-loader pretend that j.l.Iterable is c.l.Iterable
  9. Reserve and escape the iterator method name
  10. We lose the ability to call j.l.Iterable.iterator() since that doesn't exist in Ceylon-land anymore, which implies that we can't get access to any j.u.Iterator and we lose its remove() method.
  11. is j.l.Iterable<T> will likely return true for c.l.Iterable<T>
  12. for will have to use the j.u.Iterator to iterate, since we can't know at compile-time whether something really is c.l.Iterable or j.l.Iterable, which means wrappers even for Ceylon iterables. And this sucks.
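Step 6 of the list above (a runtime type test on every use of the Ceylon `iterator` attribute once the static type has been lowered to j.l.Iterable) might compile to something like the following. This is only a sketch: `CeylonStyleIterator`, `CeylonStyleIterable`, `ceylonIterator()` and `LoweredCalls` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for ceylon.language.Iterator.
interface CeylonStyleIterator {
    Object EXHAUSTED = new Object();

    Object next();
}

// Hypothetical lowered form of c.l.Iterable: it is also a java.lang.Iterable.
interface CeylonStyleIterable<T> extends Iterable<T> {
    CeylonStyleIterator ceylonIterator();
}

final class LoweredCalls {
    // The static type is only java.lang.Iterable, so the call has to inspect
    // the runtime type before it can reach the Ceylon iterator.
    static CeylonStyleIterator iteratorOf(Iterable<?> lowered) {
        if (lowered instanceof CeylonStyleIterable) {
            // A real Ceylon iterable: use its own iterator, no wrapper needed.
            return ((CeylonStyleIterable<?>) lowered).ceylonIterator();
        }
        // A plain Java iterable: adapt hasNext()/next() to the sentinel protocol.
        final Iterator<?> it = lowered.iterator();
        return () -> it.hasNext() ? it.next() : CeylonStyleIterator.EXHAUSTED;
    }
}
```

Every such call pays for an `instanceof` test, and the Java branch allocates an adapter, which is the kind of overhead steps 6 and 12 point at.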

At this point, is this really worth the trouble?

quintesse commented 12 years ago

I don't like at all where this is going. We don't have a clear idea how the heck we could (reasonably) implement this. So as long as the proponents of this idea don't come up with a clear solution on how to handle this, I'd say forget about all of this. Just make for understand j.l.Iterable, generating specific code to handle it. In all other cases you'll just have to use the Java iterator directly or use a wrapper class (which we could provide). If in the future we notice that we're using an awful amount of those wrappers everywhere we can always revisit this issue after we've had some more time to think about it.

FroMage commented 12 years ago

I'm with @quintesse here. This sounds much safer. If we don't want to make for iterate over java.lang.Iterable then let it rely on wrappers:

List<String> l = ArrayList<String>();
for(String s in toIterable(l)){
}

I'm not sure we can optimise this solution because the minute we lose the info of the type of Iterable (Ceylon or Java) at compile-time, we can't produce an adequate for loop. We certainly can't produce a single for loop that works for both types of Iterable at runtime since that would require lots of instanceof. Only allowing for to work with both types of Iterable will allow us to optimise the wrappers away.

I don't think it's unreasonable to say in the spec that for can iterate over platform-dependent iterables as well as the Ceylon one. Hell if we can say that the size of Integer is platform-dependent we can do that too. This would let us do the same for JavaScript if needed, WDYT @chochos?

@quintesse: BTW, how does this relate to the issue of being able to iterate over arrays efficiently?

chochos commented 12 years ago

Javascript doesn't even have the concept of iterators, so we only need to deal with c.l.Iterator; we have several (mostly private) Iterator implementations that are used by each collection (RangeIterator, ArraySequenceIterator, SingletonIterator, etc). for in js is implemented like this:

var tmpvar1=whatever.getIterator();
var tmpvar2;
while ((tmpvar2=tmpvar1.next()) !== exhausted) { ... }
//and if there's an else then we have this after the loop
if (tmpvar2 === exhausted) { ... }

quintesse commented 12 years ago

I was suggesting to just allow Java Iterables and generate specific code for them, so the above code would simply be:

List<String> l = ArrayList<String>();
for (String s in l) {
}

This specific case, which will most likely be the most common, will just be handled transparently.

It won't work when trying to pass Iterables around as parameters, but when are you going to do that? I mean, Java collections and Ceylon collections are not compatible anyway, so the only thing I can imagine is that you have code that can handle both Java and Ceylon collections and want to iterate over them in a generic way. But why is that useful if almost anything else you're going to do with those collections will need separate code anyway?

@FroMage: the same thing more or less, if you do:

Array<String> arr = array("aap", "noot", "mies");
for (String s in arr) {
}

the code generated for the for-loop will be specific for that case and could be something like:

Array<String> arr = ....
for (int $tmp = 0; $tmp < arr.length; $tmp++) {
    final String s = arr[$tmp];
}

(of course, like we already discussed on chat, we will not be able to do any of this with the current design of Array)

FroMage commented 12 years ago

Javascript doesn't even have the concept of iterators

Well, no but what about JavaScript's Array type? How do you iterate over it in Ceylon/JS?

FroMage commented 12 years ago

the code generated for the for-loop will be specific for that case

So if you cast the Array<String> into an Iterable<String> you get the same issue of it being impossible to optimise, right?

chochos commented 12 years ago

Well, no but what about JavaScript's Array type? How do you iterate over it in Ceylon/JS?

You don't... interop in ceylon-js is necessarily one-way only (you can call Ceylon-generated JS from vanilla JS but you can't call JS code from Ceylon). If you want to pass a js array to Ceylon, you can wrap it in an ArraySequence:

var vanilla_array=[ blabla ];
someCeylonObject.methodExpectingSequence($$$cl15.ArraySequence(vanilla_array));

quintesse commented 12 years ago

interop in ceylon-js is necessarily one-way only

You say that but I'm not sure I agree, I'm pretty sure we'll need some kind of way to call into native JS to be able to do more interesting things. Like for example access to the HTML DOM model inside a browser (see this Kotlin demo for example http://kotlin-demo.jetbrains.com/?folder=Canvas&name=Hello,_Kotlin)

chochos commented 12 years ago

Access to the DOM model is one thing, and we've already said it's going to be quite problematic; but having a full way to call JS code from Ceylon code, I think that's another thing entirely, and if you look at it from the Ceylon side, it's the same as having interop with ruby or any other language that doesn't even reside in the JVM.

chochos commented 12 years ago

mmmm magic functions getContext() and getCanvas(), I wonder how those are implemented.

quintesse commented 12 years ago

Well I understand that we can't just call any JS code that happens to exist, but not even if it's available as a CommonJS module?

quintesse commented 12 years ago

@FroMage sure, but I don't see any way around that, the moment you pass an Array along to a method taking an Iterable it could be anything, including stuff that can't be optimized. If you want you can have something like this, which is pretty ugly I must admit:

void loop<T>(Iterable<T> i) {
    if (is Array<T> i) {
        for (T t in i) { ... }
    } else {
        for (T t in i) { ... }
    }
}

and this would actually generate two different for-loops.... bit icky, isn't it?

NB: in "our" idea of handling Arrays (with type-specific subtypes) we could make the iterator much more efficient because we can get rid of a lot of boxing. Still, not as good as a plain old for (int i = 0; i < array.length; i++)

chochos commented 12 years ago

The problem is not whether the JS code exists as a CommonJS module; the problem is that at compile time, we just don't know if that code exists, what it does, what methods it responds to, etc. No way to validate that...

quintesse commented 12 years ago

Really? I thought that was what CommonJS was for, that it has this "metadata" where you can ask for dependencies and which "methods" are exported. But yeah, the problem of course always is: what are the parameters and return values "made of". Which would bring us back to the whole discussion of dynamic or unchecked dispatch and this is not the thread for that. Sorry I brought it up here.

FroMage commented 12 years ago

This is likely to slip to M3, removing high prio.

FroMage commented 12 years ago

Slips to M3.

FroMage commented 11 years ago

Mmm, I still have lots of code for this lying around in my branch. Do we need to revive this?

quintesse commented 11 years ago

Not sure, what is it you have right now?

FroMage commented 11 years ago

That the for loop works for Java Iterable types.

FroMage commented 11 years ago

Reassigned to @gavinking to see if he wants for loop support for java iterables or not. I think this will slip to M5.