eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0

special handling for Java Iterable #403

Closed CeylonMigrationBot closed 8 years ago

CeylonMigrationBot commented 12 years ago

[@gavinking] I think we're going to need to do the following:

  1. automagically transform any occurrence of java.lang.Iterable to Ceylon Iterable in the model loader, and
  2. generate code to transform between these two types at runtime.

Otherwise you just won't be able to conveniently iterate over Java collections using for.

Thoughts?

[Migrated from ceylon/ceylon-compiler#403]

CeylonMigrationBot commented 12 years ago

[@quintesse] Why not just make for accept both types of Iterator? I mean, is there a reason it would have to be a Java iterator wrapped in a Ceylon iterator for example?

CeylonMigrationBot commented 12 years ago

[@chochos] The Iterator behavior is quite different. Ceylon iterators only have a next method and return exhausted when they're done (and keep returning that value indefinitely if you keep calling next). Java iterators have a hasNext and throw an exception if you call next after they have returned the last element.

I think it would be easier to have a special java.util.Iterator that wraps a Ceylon Iterator and a special Ceylon Iterator that wraps a java.util.Iterator. The only problem would be when passing an iterator back and forth between Ceylon and Java code (you could end up with an onion-like iterator) but maybe that can be solved at compile-time by checking if the wrapping really needs to be done (wrap/unwrap as needed).
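The two wrappers described in this comment could be sketched roughly as follows. This is only an illustration in plain Java: `SentinelIterator`, `EXHAUSTED`, `fromJava` and `toJava` are hypothetical names standing in for Ceylon's iterator protocol, not anything from ceylon.language or the compiler.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a Ceylon-style iterator: next() returns
// EXHAUSTED once the elements run out, and keeps returning it afterwards.
interface SentinelIterator<T> {
    Object EXHAUSTED = new Object();

    Object next();
}

final class IteratorAdapters {
    private IteratorAdapters() {}

    // Wrap a java.util.Iterator so that it follows the sentinel protocol.
    static <T> SentinelIterator<T> fromJava(Iterator<? extends T> it) {
        return () -> it.hasNext() ? it.next() : SentinelIterator.EXHAUSTED;
    }

    // Wrap a sentinel-style iterator as a java.util.Iterator. hasNext()
    // has to read one element ahead and cache it until next() is called.
    static <T> Iterator<T> toJava(SentinelIterator<T> it) {
        return new Iterator<T>() {
            private Object cached;
            private boolean hasCached;

            @Override
            public boolean hasNext() {
                if (!hasCached) {
                    cached = it.next();
                    hasCached = true;
                }
                return cached != SentinelIterator.EXHAUSTED;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                hasCached = false;
                T result = (T) cached;
                cached = null;
                return result;
            }
        };
    }
}
```

Calling `toJava(fromJava(x))` in this sketch produces exactly the onion-like nesting mentioned above; avoiding it would need an extra check for an already-wrapped iterator.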

CeylonMigrationBot commented 12 years ago

[@quintesse] No, what I mean is, the code that generates the for-loop could just handle both iterators because it doesn't need to do anything special, just two slightly different ways of testing the end. But if we want to treat iterators always as interchangeable, well that's different of course.
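To make the "two slightly different ways of testing the end" concrete, here is a rough Java sketch of the two loop shapes a backend might emit. It is an assumption about what such output could look like, not the actual compiler output; `LoopShapes` and `FINISHED` are made-up names, and a `Supplier` stands in for a Ceylon iterator's `next`.

```java
import java.util.Iterator;
import java.util.function.Supplier;

final class LoopShapes {
    // Hypothetical end-of-iteration marker standing in for Ceylon's exhausted.
    static final Object FINISHED = new Object();

    private LoopShapes() {}

    // Shape for a Java iterable: ask hasNext() before every fetch.
    static int sumJavaStyle(Iterable<Integer> source) {
        int total = 0;
        Iterator<Integer> it = source.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    // Shape for a sentinel-style iterator: fetch first, then compare the
    // fetched value with the end marker.
    static int sumSentinelStyle(Supplier<Object> next) {
        int total = 0;
        Object element;
        while ((element = next.get()) != FINISHED) {
            total += (Integer) element;
        }
        return total;
    }
}
```

Both loops have the same body; only the termination test differs, which is the point being made here.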

CeylonMigrationBot commented 12 years ago

[@chochos] That only works for for loops in Ceylon, but it doesn't do anything for Java code that receives a Ceylon iterator when it's expecting a Java one. On the Ceylon side, code expecting a Ceylon iterator will break when it gets a Java iterator at runtime (since it never gets an exhausted but rather an exception is thrown).

CeylonMigrationBot commented 12 years ago

[@FroMage] Well… It's not that simple to automagically pretend j.l.Iterable is c.l.Iterable. So for example j.u.List<T> will satisfy c.l.Iterable, which will hide the real implementation of j.l.Iterable, so suddenly there will be a lot of boxing, and special cases for is, plus more boxing to get the iterator.

Because we'll also have to do magic for j.u.Iterator and c.l.Iterator.

Not sure I like this. I'd rather have for support both by generating different code, and provide conversion functions.

CeylonMigrationBot commented 12 years ago

[@gavinking]

Why not just make for accept both types of Iterator?

Because then the definition of for in the language specification would depend upon a type which is not defined in ceylon.language.

CeylonMigrationBot commented 12 years ago

[@quintesse] I don't see how this would be different than silently wrapping a Java iterator in a Ceylon iterator.

CeylonMigrationBot commented 12 years ago

[@gavinking]

I don't see how this would be different than silently wrapping a Java iterator in a Ceylon iterator.

Very different. Two totally different levels of abstraction. One is part of the language definition. One is part of the interop layer.

CeylonMigrationBot commented 12 years ago

[@quintesse] No, I mean: what's the difference? Either I silently generate a wrapper or I make the Java backend generate special code for Java iterators. In either case the result is the same and I don't see why one case has to be mentioned in the spec and the other doesn't.

CeylonMigrationBot commented 12 years ago

[@gavinking] Because the typechecker has to know that

for (x in javaIterable) { ... }

is well-typed.

CeylonMigrationBot commented 12 years ago

[@quintesse] Ok, I still think we could do model loader tricks, the same as we "pretend" that a Java String really is a Ceylon String, but I give up, wrappers it is.

CeylonMigrationBot commented 12 years ago

[@FroMage] Does it make any difference that String is final and j.u.Iterator is an interface? If we do model loader tricks I mean.

CeylonMigrationBot commented 12 years ago

[@RossTate] I vote for c.l.Iterable to extend j.l.Iterable and provide an unoverrideable implementation of its iterator() method that simply wraps the result of the Ceylon iterator attribute. Then make it so that our for can accept a j.l.Iterable but optimize it when we know it's a c.l.Iterable. (Note that I'm not saying c.l.Iterator should extend j.u.Iterator, nor am I saying either should be implicitly coerced/wrapped into the other in general, only in the implementation of iterator().)

Now, I'm guessing Gavin's gonna retort with the fact that this means Ceylon's for has to have knowledge of Java's j.l.Iterable. That's true if we're writing Ceylon code that interops with Java. However, for any Ceylon code that is isolated, the for only has to know about our c.l.Iterable. You can think of this as a language extension that exists only when compiling to Java. If we're compiling to JavaScript, then there will be no mention of j.l.Iterable in our class hierarchy or our type system.

Note that we'll probably run into similar issues when interoping with other languages. We'll probably need to make similar language extensions that are transparent for non-interoping code.
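The shape of this proposal can be approximated in Java as below. This is a sketch under stated assumptions: `CIterator` and `CIterable` are hypothetical stand-ins for c.l.Iterator and c.l.Iterable, and a Java 8 default method is used for the bridge, which is weaker than the "unoverrideable" method asked for because implementors can still override it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for c.l.Iterator: next() returns EXHAUSTED at the end.
interface CIterator<T> {
    Object EXHAUSTED = new Object();

    Object next();
}

// Hypothetical stand-in for c.l.Iterable on the JVM backend only.
interface CIterable<T> extends Iterable<T> {
    CIterator<T> getIterator();

    // Bridge for Java callers: wraps the Ceylon-style iterator on demand.
    // CIterator itself does not extend java.util.Iterator.
    @Override
    default Iterator<T> iterator() {
        CIterator<T> source = getIterator();
        return new Iterator<T>() {
            // Read one element ahead so hasNext() can answer without fetching.
            private Object pending = source.next();

            @Override
            public boolean hasNext() {
                return pending != CIterator.EXHAUSTED;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (pending == CIterator.EXHAUSTED) {
                    throw new NoSuchElementException();
                }
                T result = (T) pending;
                pending = source.next();
                return result;
            }
        };
    }
}
```

With this in place, a Java enhanced `for` statement accepts any `CIterable`, while Ceylon-side code can keep calling `getIterator()` directly and never pay for the wrapper.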

CeylonMigrationBot commented 12 years ago

[@FroMage] Guys, I got to the point where for supports Java Iterable, so that's one thing easy to do.

Now, erasing both Java's Iterable and Iterator to Ceylon's is a much harder problem. I just thought of the following example:


void iterableCasting(){
    JList<Integer> l = JArrayList<Integer>();
    Iterable<Integer> iterable = l;
    if(is JList<Integer> l2 = iterable){
        print((l2 === l).string);
    }
}

I would expect l, l2 and iterable to hold the same reference myself, but with boxing they won't. Now for some reason the type checker is telling me that java.util.List is not an IdentifiableObject but that's a bit weird, right? That must be a model loader bug, no?

CeylonMigrationBot commented 12 years ago

[@FroMage] Mmm, it looks like interfaces extend Object and not IdentifiableObject, which means you can't do === with interface types. Sounds a bit extreme, no?

Still, I can rewrite the example above as:

void iterableCasting(){
    JArrayList<Integer> l = JArrayList<Integer>();
    Iterable<Integer> iterable = l;
    if(is JArrayList<Integer> l2 = iterable){
        print((l2 === l).string);
    }
}

CeylonMigrationBot commented 12 years ago

[@RossTate] There are many issues with implicit wrappings. For example, try to get the following to work:

Sequence<T> singleton<T>(T t) { return {t}; }
JIterable<Integer> itr = ...;
return sameSums(singleton(itr)) && JsameSums(singleton(itr));

where Boolean sameSums(CIterable<CIterable<Integer>> intBags) is in Ceylon and boolean JsameSums(JIterable<? extends JIterable<? extends Integer>> intBags) is imported from Java.

This is tricky because you have to look at how the returned value of singleton(itr) is being used to figure out whether or not you should wrap itr into a CIterable before passing it to singleton.

Implicit coercion is one of those features which sounds simple but does not mix well with other features.

CeylonMigrationBot commented 12 years ago

[@gavinking] Irrelevant because this would not be an implicit conversion or wrapping.

On Mar 1, 2012, at 1:07 PM, Ross Tate reply@reply.github.com wrote:

There are many issues with implicit wrappings. For example, try to get the following to work:

Sequence<T> singleton<T>(T t) { return {t}; }
JIterable<Integer> itr = ...;
return sameSums(singleton(itr)) && JsameSums(singleton(itr));

where Boolean sameSums(CIterable<CIterable<Integer>> intBags) is in Ceylon and boolean JsameSums(JIterable<? extends JIterable<? extends Integer>> intBags) is imported from Java.

This is tricky because you have to look at how the returned value of singleton(itr) is being used to figure out whether or not you should wrap itr into a CIterable before passing it to singleton.

Implicit coercion is one of those features which sounds simple but does not mix well with other features.


CeylonMigrationBot commented 12 years ago

[@RossTate] Huh?

I can make it work without any fancy features just by inserting CIterable<T> wrap<T>(JIterable<T>):

Sequence<T> singleton<T>(T t) { return {t}; }
JIterable<Integer> itr = ...;
return sameSums(singleton(wrap(itr))) && JsameSums(singleton(itr));

so I don't know what you mean.

CeylonMigrationBot commented 12 years ago

[@gavinking]

Ok, I still think we could do model loader tricks, the same as we "pretend" that a Java String really is a Ceylon String, but I give up

Huh? Isn't that exactly what I'm proposing?

CeylonMigrationBot commented 12 years ago

[@gavinking]

I vote for c.l.Iterable to extend j.l.Iterable

I think this is just an awful idea.

Now, I'm guessing Gavin's gonna retort with the fact that this means Ceylon's for has to have knowledge of Java's j.l.Iterable.

No, I'm going to retort that this means that ceylon.language has a dependency on the monolithic language module of a completely different language that is not even available on all platforms. So on some platforms c.l.Iterator has additional operations it does not have on others. From a philosophical and practical point of view that is just crazy!

CeylonMigrationBot commented 12 years ago

[@FroMage] @gavinking what about the problem I raised in #403#issuecomment-4261681 and the next comment? The thing is that autoboxing like we do for non-IdentifiableObjects is fine (String, Number…) but in the case of Iterator and Iterable it breaks identity.
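The identity problem raised here can be reproduced in plain Java. The sketch below assumes the interop layer allocates a wrapper object whenever a Java collection is viewed as a Ceylon-style iterable; `WrappedIterable` is a hypothetical class for illustration, not something the compiler generates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper that an interop layer might allocate when a Java
// collection is assigned to a Ceylon-style Iterable.
final class WrappedIterable<T> implements Iterable<T> {
    final Iterable<T> delegate;

    WrappedIterable(Iterable<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Iterator<T> iterator() {
        return delegate.iterator();
    }
}

final class IdentityDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        Iterable<Integer> wrapped = new WrappedIterable<>(list);

        // The wrapper is a different object, so reference equality fails.
        System.out.println(wrapped == list);              // false
        // The wrapper also hides the runtime type of the collection.
        System.out.println(wrapped instanceof ArrayList); // false
        // Only by unwrapping do we get the original reference back.
        System.out.println(((WrappedIterable<Integer>) wrapped).delegate == list); // true
    }
}
```

Boxing a String or a number does not have this problem because those values are never compared by identity, whereas collections routinely are.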

CeylonMigrationBot commented 12 years ago

[@RossTate] Okay, so say you're writing a library, and you've decided you want to write it in Ceylon cuz it's this new cool language. However, you want your library to be accessible, and Ceylon's not that big yet, so you want it to also be able to interact with Java code. As such, you want your classes to implement appropriate Java interfaces only when interoping to Java. You may even realize your library would be useful for JavaScript, so you want your classes to implement appropriate JavaScript methods only when interoping with JavaScript. Otherwise, when you're interoping with just plain Ceylon code, then you don't care if those interfaces or methods are present (in fact you don't want them to be present).

This seems like something that may be common in Ceylon's early stages. In fact, it's already happening. We have our own collection library, and we want Java code to be able to use our collections as they would standard Java collections. However, as you say, we don't want to have a dependency on Java when we're not trying to interop with Java. So, I'm trying to propose a way to have Java compatibility without Java dependency. I'm hoping this solution would not only solve our problems, but future Ceylon developers' problems as well, while avoiding the hackiness of implicit coercions.

CeylonMigrationBot commented 12 years ago

[@gavinking] @RossTate To me that is really well beyond the goals we have for Java interop. It is explicitly a non-goal to be able to write libraries in Ceylon for other-JVM-language audiences. If that were a goal, it would call for a completely different design of the language module.

CeylonMigrationBot commented 12 years ago

[@gavinking] @FroMage a worse problem is:

Iterable<Object> i = ArrayList<Object>();
if (is ArrayList<Object> i) { ... }

So wrapping is truly problematic. But I think that suggests a solution.

What we could do is have the Java impl of c.l.Iterable extend j.l.Iterable and get the compiler to automatically fill in the implementation of iterator() on classes that satisfy it. Then we lower the type c.l.Iterable to j.l.Iterable everywhere in all client code. WDYT?

Note that this solution still calls for the model-loader magic on Java iterables, and also requires special handling of the iterator attribute of c.l.Iterable.

CeylonMigrationBot commented 12 years ago

[@FroMage] I thought about it a bit and ended up with this pseudo code:

interface CeylonIterable<T> extends java.lang.Iterable<T>{
    boolean getEmpty();
    CeylonIterator<T> getIterator();
}

interface CeylonIterator<T> extends java.util.Iterator<T>{
    Object getNext();
}

class CeylonIterableImpl implements CeylonIterable<String>{

    @Override
    public java.util.Iterator<String> iterator() {
        return getIterator();
    }

    @Override
    public boolean getEmpty() {
        return false;
    }

    @Override
    public CeylonIterator<String> getIterator() {
        return null;
    }

}

class CeylonIteratorImpl implements CeylonIterator<String>{

    Object cachedNext;
    boolean isNextCached = false;

    @Override
    public boolean hasNext() {
        // get the next one to see what happens
        if(!isNextCached){
            cachedNext = $userNext();
            isNextCached = true;
        }
        return cachedNext != ceylon.language.exhausted.getExhausted();
    }

    @Override
    public String next() {
        Object next = getNext();
        if(next == ceylon.language.exhausted.getExhausted())
            throw new NoSuchElementException();
        return (String)next;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    @Override
    public Object getNext() {
        Object next;
        // make sure we use the cache if it's there
        if(isNextCached){
            next = cachedNext;
            isNextCached = false;
            cachedNext = null;
        }else
            next = $userNext();
        return next;
    }

    private Object $userNext() {
        return null;
    }

}

You can see that:

  1. If we want to make j.l.Iterable extend c.l.Iterable we are going to have to figure out how to provide an empty attribute.
  2. Our implementations of c.l.Iterator will have annoying stuff to deal with Java's hasNext(), such as moving the user code for getNext() into $userNext().
  3. We effectively have to reserve the hasNext, next, remove and iterator method names like we do for Java's hashCode and toString.
  4. We don't have anything like Java's Iterator.remove, which is another problem, but one that we should fix, because it's very useful in Java.

I have a feeling there are other issues, but I can't quite put my finger on them yet, I'll try again tomorrow.

CeylonMigrationBot commented 12 years ago

[@RossTate] I wouldn't have Ceylon's iterator extend Java's iterator. It's not necessary for the for syntax in either language, and it adds memory overhead that's completely useless when using Ceylon iterators as Ceylon iterators (which ideally is the common case).

CeylonMigrationBot commented 12 years ago

[@gavinking] Agreed, I don't think we need Ceylon iterators to be Java iterators.

On Mar 8, 2012, at 1:11 PM, Ross Tate reply@reply.github.com wrote:

I wouldn't have Ceylon's iterator extend Java's iterator. It's not necessary for the for syntax in either language, and it adds memory overhead that's completely useless when using Ceylon iterators as Ceylon iterators (which ideally is the common case).


CeylonMigrationBot commented 12 years ago

[@FroMage] How can the Java impl of c.l.Iterable extend j.l.Iterable without doing the same for Iterator?

CeylonMigrationBot commented 12 years ago

[@FroMage] Right, so one other issue is this:

Container i = ArrayList<Object>();
if (is ArrayList<Object> i) { ... }

So it looks like we have to do some more magic on Container, or drop it from Iterable.

CeylonMigrationBot commented 12 years ago

[@FroMage] I think we can make this work if we:

  1. Remove Container from c.l.Iterable
  2. Remove empty from c.l.Iterable
  3. Lower every mention of c.l.Iterable into j.l.Iterable except for satisfies clauses
  4. Make the Java impl of c.l.Iterable implement j.l.Iterable
  5. Add a generated j.l.Iterable.iterator() method for each class implementation of c.l.Iterable.iterator, which will call a utility method which wraps a c.l.Iterator as a j.u.Iterator
  6. Add a runtime type test for each call to c.l.Iterable.iterator since the underlying object may be of type j.l.Iterable and not have this property
  7. Add a special-case for is c.l.Iterable<T>
  8. Make the model-loader pretend that j.l.Iterable is c.l.Iterable
  9. Reserve and escape the iterator method name
  10. We lose the ability to call j.l.Iterable.iterator() since that doesn't exist in Ceylon-land anymore, which implies that we can't get access to any j.u.Iterator and we lose its remove() method.
  11. is j.l.Iterable<T> will likely return true for c.l.Iterable<T>
  12. for will have to use the j.u.Iterator to iterate, since we can't know at compile-time whether something really is c.l.Iterable or j.l.Iterable, which means wrappers even for Ceylon iterables. And this sucks.

At this point, is this really worth the trouble?

CeylonMigrationBot commented 12 years ago

[@quintesse] I don't like at all where this is going. We don't have a clear idea how the heck we could (reasonably) implement this. So as long as the proponents of this idea don't come up with a clear solution on how to handle this, I'd say forget about all of this. Just make for understand j.l.Iterable, generating specific code to handle it. In all other cases you'll just have to use the Java iterator directly or use a wrapper class (which we could provide). If in the future we notice that we're using an awful amount of those wrappers everywhere we can always revisit this issue after we've had some more time to think about it.

CeylonMigrationBot commented 12 years ago

[@FroMage] I'm with @quintesse here. This sounds much safer. If we don't want to make for iterate over java.lang.Iterable then let it rely on wrappers:

List<String> l = ArrayList<String>();
for(String s in toIterable(l)){
}

I'm not sure we can optimise this solution because the minute we lose the info of the type of Iterable (Ceylon or Java) at compile-time, we can't produce an adequate for loop. We certainly can't produce a single for loop that works for both types of Iterable at runtime since that would require lots of instanceof. Only allowing for to work with both types of Iterable will allow us to optimise the wrappers away.
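For illustration, a single loop that decides at run time which kind of iterable it received might look like the following Java sketch. The names `RuntimeDispatchLoop`, `SentinelSource` and `END` are hypothetical, and a `Supplier` stands in for a Ceylon iterator; the point is only to show the `instanceof` branching that this paragraph wants to avoid.

```java
import java.util.Iterator;
import java.util.function.Supplier;

final class RuntimeDispatchLoop {
    // Hypothetical stand-in for a Ceylon iterable whose iterator signals the
    // end by returning END instead of throwing.
    interface SentinelSource {
        Object END = new Object();

        Supplier<Object> open();
    }

    private RuntimeDispatchLoop() {}

    // One entry point that must pick its strategy at run time, because the
    // static type no longer says which kind of iterable was passed in.
    static int count(Object source) {
        int n = 0;
        if (source instanceof SentinelSource) {
            Supplier<Object> next = ((SentinelSource) source).open();
            while (next.get() != SentinelSource.END) {
                n++;
            }
        } else if (source instanceof Iterable) {
            Iterator<?> it = ((Iterable<?>) source).iterator();
            while (it.hasNext()) {
                it.next();
                n++;
            }
        } else {
            throw new IllegalArgumentException("not iterable: " + source);
        }
        return n;
    }
}
```

Every loop compiled this way carries both branches and a type test, whereas a loop whose iterable kind is known at compile time needs only one of them.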

I don't think it's unreasonable to say in the spec that for can iterate over platform-dependent iterables as well as the Ceylon one. Hell if we can say that the size of Integer is platform-dependent we can do that too. This would let us do the same for JavaScript if needed, WDYT @chochos?

@quintesse: BTW, how does this relate to the issue of being able to iterate over arrays efficiently?

CeylonMigrationBot commented 12 years ago

[@chochos] Javascript doesn't even have the concept of iterators, so we only need to deal with c.l.Iterator; we have several (mostly private) Iterator implementations that are used by each collection (RangeIterator, ArraySequenceIterator, SingletonIterator, etc). for in js is implemented like this:

var tmpvar1=whatever.getIterator();
var tmpvar2;
while ((tmpvar2=tmpvar1.next()) !== exhausted) { ... }
//and if there's an else then we have this after the loop
if (tmpvar2 === exhausted) { ... }

CeylonMigrationBot commented 12 years ago

[@quintesse] I was suggesting to just allow Java Iterables and generate specific code for them, so the above code would simply be:

List<String> l = ArrayList<String>();
for (String s in l) {
}

This specific case, which will most likely be the most common, will just be handled transparently.

It won't work when trying to pass Iterables around as parameters, but when are you going to do that? I mean, Java collections and Ceylon collections are not compatible anyway, so the only thing I can imagine is that you have code that can handle both Java and Ceylon collections and want to iterate over them in a generic way. But why is that useful if almost anything else you're going to do with those collections will need separate code anyway?

@FroMage: the same thing more or less, if you do:

Array<String> arr = array("aap", "noot", "mies");
for (String s in arr) {
}

the code generated for the for-loop will be specific for that case and could be something like:

Array<String> arr = ....
for (int $tmp = 0; $tmp < arr.length; $tmp++) {
    final String s = arr[$tmp];
}

(of course, like we already discussed on chat, we will not be able to do any of this with the current design of Array)

CeylonMigrationBot commented 12 years ago

[@FroMage]

Javascript doesn't even have the concept of iterators

Well, no but what about JavaScript's Array type? How do you iterate over it in Ceylon/JS?

CeylonMigrationBot commented 12 years ago

[@FroMage]

the code generated for the for-loop will be specific for that case

So if you cast the Array<String> into an Iterable<String> you get the same issue of it being impossible to optimise, right?

CeylonMigrationBot commented 12 years ago

[@chochos]

Well, no but what about JavaScript's Array type? How do you iterate over it in Ceylon/JS?

You don't... interop in ceylon-js is necessarily one-way only (you can call Ceylon-generated JS from vanilla JS but you can't call JS code from Ceylon). If you want to pass a js array to Ceylon, you can wrap it in an ArraySequence:

var vanilla_array=[ blabla ];
someCeylonObject.methodExpectingSequence($$$cl15.ArraySequence(vanilla_array));

CeylonMigrationBot commented 12 years ago

[@quintesse]

interop in ceylon-js is necessarily one-way only

You say that but I'm not sure I agree, I'm pretty sure we'll need some kind of way to call into native JS to be able to do more interesting things. Like for example access to the HTML DOM model inside a browser (see this Kotlin demo for example http://kotlin-demo.jetbrains.com/?folder=Canvas&name=Hello,_Kotlin)

CeylonMigrationBot commented 12 years ago

[@chochos] Access to the DOM model is one thing, and we've already said it's going to be quite problematic; but having a full way to call JS code from Ceylon code, I think that's another thing entirely, and if you look at it from the Ceylon side, it's the same as having interop with ruby or any other language that doesn't even reside in the JVM.

CeylonMigrationBot commented 12 years ago

[@chochos] mmmm magic functions getContext() and getCanvas(), I wonder how those are implemented.

CeylonMigrationBot commented 12 years ago

[@quintesse] Well I understand that we can't just call any JS code that happens to exist, but not even if it's available as a CommonJS module?

CeylonMigrationBot commented 12 years ago

[@quintesse] @FroMage sure, but I don't see any way around that, the moment you pass an Array along to a method taking an Iterable it could be anything, including stuff that can't be optimized. If you want you can have something like this, which is pretty ugly I must admit:

void loop<T>(Iterable<T> i) {
    if (is Array<T> i) {
        for (T t in i) { ... }
    } else {
        for (T t in i) { ... }
    }
}

and this would actually generate two different for-loops.... bit icky, isn't it?

NB: in "our" idea of handling Arrays (with type-specific subtypes) we could make the iterator much more efficient because we can get rid of a lot of boxing. Still, not as good as a plain old for (int i = 0; i < array.length; i++)

CeylonMigrationBot commented 12 years ago

[@chochos] The problem is not whether the JS code exists as a CommonJS module; the problem is that at compile time, we just don't know if that code exists, what it does, what methods it responds to, etc. No way to validate that...

CeylonMigrationBot commented 12 years ago

[@quintesse] Really? I thought that was what CommonJS was for, that it has this "metadata" where you can ask for dependencies and which "methods" are exported. But yeah, the problem of course always is: what are the parameters and return values "made of". Which would bring us back to the whole discussion of dynamic or unchecked dispatch and this is not the thread for that. Sorry I brought it up here.

CeylonMigrationBot commented 12 years ago

[@FroMage] This is likely to slip to M3, removing high prio.

CeylonMigrationBot commented 12 years ago

[@FroMage] Slips to M3.

CeylonMigrationBot commented 11 years ago

[@FroMage] Mmm, I still have lots of code for this lying around in my branch. Do we need to revive this?

CeylonMigrationBot commented 11 years ago

[@quintesse] Not sure, what is it you have right now?

CeylonMigrationBot commented 11 years ago

[@FroMage] That the for loop works for Java Iterable types.

CeylonMigrationBot commented 11 years ago

[@FroMage] Reassigned to @gavinking to see if he wants for loop support for Java iterables or not. I think this will slip to M5.