Open aaime opened 4 years ago
One of the goals we had when adding PyObject was to allow developers to access the Python internals more naturally and create layers on top of jep that can translate arbitrarily complex objects. Iterators are pretty core functionality, and I agree it would be helpful if jep could support them, but there is some complexity with type checking and generics that would need to be sorted out, and a Python generator probably translates more closely to a Java Supplier. In the meantime, you can use PyObject to incrementally call into the generator, or even make your own Iterator implementation like this:
import java.util.Iterator;
import java.util.NoSuchElementException;

import jep.JepException;
import jep.python.PyCallable;
import jep.python.PyObject;

public class PyIterator implements Iterator<PyObject> {
    private final PyObject pyObject;
    private final PyCallable nextMethod;
    /* Needed to implement hasNext */
    private PyObject next;

    public PyIterator(PyObject pyObject) throws JepException {
        this.pyObject = pyObject;
        this.nextMethod = pyObject.getAttr("__next__", PyCallable.class);
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = nextMethod.callAs(PyObject.class);
            } catch (JepException e) {
                /* This just assumes it is a StopIteration exception */
            }
        }
        return next != null;
    }

    @Override
    public PyObject next() {
        if (next == null) {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
        }
        /* Hand out the buffered element and clear the buffer */
        PyObject result = this.next;
        this.next = null;
        return result;
    }
}
And here is an example snippet of how to use that code:
public static void main(String[] args) throws JepException {
    try (Interpreter interp = new SharedInterpreter()) {
        interp.exec("generator = (x**2 for x in [1, 3, 6, 10])");
        PyObject pyObj = interp.getValue("generator", PyObject.class);
        Iterator<PyObject> it = new PyIterator(pyObj);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
@bsteffensmeier
I have trouble with this approach when I try to iterate over a tuple with a Java object inside. For some reason it always throws a StopIteration exception when it encounters the Java object. Is this expected? I would expect to receive a PyObject wrapper around the PyJObject as well, one that returns the Java object when I call PyObject.as(Object.class).
I checked that explicitly requesting a PyObject with the getValue API also fails. See the test below:
public class PyIterator implements Iterator<PyObject> {
    private final PyCallable nextMethod;
    /* Needed to implement hasNext */
    private PyObject next;

    public PyIterator(PyObject pyObject) throws JepException {
        this.nextMethod = pyObject.getAttr("__next__", PyCallable.class);
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = nextMethod.callAs(PyObject.class);
            } catch (JepException e) {
                /* This just assumes it is a StopIteration exception */
            }
        }
        return next != null;
    }

    @Override
    public PyObject next() {
        if (next == null) {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
        }
        PyObject next = this.next;
        this.next = null;
        return next;
    }
}
@Test
public void testIterator() throws JepException {
    try (SharedInterpreter interpreter = new SharedInterpreter()) {
        Object javaObj = new Object();
        interpreter.set("javaObject", javaObj);
        interpreter.eval("t = (1, 2, javaObject, 4)");
        PyIterator tuple = new PyIterator(interpreter.getValue("t.__iter__()", PyObject.class));
        List<PyObject> list = Lists.newArrayList();
        while (tuple.hasNext()) {
            list.add(tuple.next());
        }
        List<Object> backToJava = list.stream().map(this::toJava).collect(Collectors.toList());
        assertEquals(Lists.newArrayList(1, 2, javaObj, 4), backToJava); // will be (1, 2)

        // also fails
        PyObject pyJObject = interpreter.getValue("javaObject", PyObject.class);
        // jep.JepException: <class 'TypeError'>: Expected jep.python.PyObject but received a java.lang.Object.
        assertTrue(pyJObject instanceof PyObject);
    }
}

public Object toJava(PyObject py) {
    try {
        return py.as(Object.class);
    } catch (JepException e) {
        fail();
        return null;
    }
}
Why do I want to do this? Because I want more control over how the Python objects are transformed on the Java side. For example, a Python long is transformed to a Java long, but I get an "int too big" exception when the Python value exceeds the maximum long value. I hope to intercept this on the Java side and receive the string representation of the Python long so that I can construct a BigInteger from it. I have a method interceptor hooked up in jep that delegates the method call to the Java side, passing the args and kwargs as PyObjects (tuple and dict). This way I can also still adapt to the keyword argument support, which was implemented based on Jython. Initially, I let jep do the auto-conversion and was expecting List
Any idea how I could achieve this?
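A minimal sketch of the Java half of that idea, assuming the decimal text of the Python int has already been obtained somehow (for instance by calling str() on the Python side; how the text reaches Java is outside this sketch). PyIntText and parse are hypothetical names, not part of jep. Values that fit in 64 bits come back as Long, anything larger as BigInteger:

```java
import java.math.BigInteger;

final class PyIntText {
    private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    private PyIntText() {
    }

    /**
     * Parses the decimal text of a Python int (hypothetical input, e.g. the
     * result of str(value)). Returns a Long when the value fits in 64 bits,
     * otherwise a BigInteger.
     */
    static Number parse(String decimalText) {
        BigInteger value = new BigInteger(decimalText.trim());
        if (value.compareTo(LONG_MIN) >= 0 && value.compareTo(LONG_MAX) <= 0) {
            return Long.valueOf(value.longValueExact());
        }
        return value;
    }

    public static void main(String[] args) {
        // One past Long.MAX_VALUE, which cannot be represented as a long.
        Number big = parse("9223372036854775808");
        System.out.println(big.getClass().getName() + " " + big);
    }
}
```

Returning Number keeps the common case cheap while still letting the caller branch on instanceof BigInteger for oversized values.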
I didn't try it, but I think your root cause would be a TypeError from here. I don't think the JepException ignored in the hasNext() method is actually a StopIteration exception; you may be able to see that TypeError by looking at the JepException thrown there.
I think you are landing here. You are trying to convert a Python object to a Java object, and the Python object is a PyJObject. The only conversion in that block applies when your object is compatible with the expectedType, which it is not, so you fall through to the end of those checks and hit raiseTypeError().
You could add an else clause in there to convert to PyObject like this:
} else if (PyJObject_Check(pyobject)) {
    PyJObject *pyjobject = (PyJObject*) pyobject;
    if ((*env)->IsAssignableFrom(env, pyjobject->clazz, expectedType)) {
        return (*env)->NewLocalRef(env, pyjobject->object);
    } else if ((*env)->IsSameObject(env, expectedType, JPYOBJECT_TYPE)) {
        return PyObject_As_JPyObject(env, pyobject);
    }
} else ...
One potential problem is that the checks above this case have the same limitation, so PyJClass and PyJArray currently cannot be converted to JPyObject either. You could add the same kind of else branch to each of them, or there is probably a more elegant way to organize that if/else block.
Yesterday, when I debugged this, I was quite sure I had seen a StopIteration exception, but I just double-checked in my test and the swallowed JepException is indeed a TypeError!
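For reference, one way to keep a real error from being mistaken for the end of iteration is to treat only StopIteration as exhaustion and rethrow everything else. The sketch below is plain Java with no jep dependency: the Callable stands in for nextMethod.callAs(PyObject.class), StrictIterator is a hypothetical name, and the message check assumes the exception text names the Python exception type, as in the "<class 'TypeError'>: ..." message from the test in this thread. It is a heuristic, not a guaranteed contract.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;

final class StrictIterator<T> implements Iterator<T> {
    private final Callable<T> nextCall; // stand-in for the Python __next__ call
    private T buffered;
    private boolean hasBuffered;
    private boolean exhausted;

    StrictIterator(Callable<T> nextCall) {
        this.nextCall = nextCall;
    }

    /** Assumption: the exception message names the Python exception type. */
    static boolean isStopIteration(Exception e) {
        String message = e.getMessage();
        return message != null && message.contains("StopIteration");
    }

    @Override
    public boolean hasNext() {
        if (!hasBuffered && !exhausted) {
            try {
                buffered = nextCall.call();
                hasBuffered = true;
            } catch (Exception e) {
                if (!isStopIteration(e)) {
                    // Surface TypeError and similar failures instead of ending silently.
                    throw new IllegalStateException("Python iteration failed", e);
                }
                exhausted = true;
            }
        }
        return hasBuffered;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        hasBuffered = false;
        return result;
    }

    public static void main(String[] args) {
        Iterator<Integer> source = Arrays.asList(1, 9, 36, 100).iterator();
        StrictIterator<Integer> it = new StrictIterator<Integer>(() -> {
            if (!source.hasNext()) {
                throw new Exception("<class 'StopIteration'>");
            }
            return source.next();
        });
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Tracking hasBuffered separately from the buffered value also means a null element is not confused with exhaustion.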
Your suggested change makes my test case pass - so it works perfectly. :) I will give it some more cycles, but it looks very promising, the jep tests are also passing. I will also check about reordering to cover more cases. Do you think it also makes sense to get this into the developer branch? It makes working with PyObjects on the java side more fluent.
Thanks for the pointer!
> Do you think it also makes sense to get this into the developer branch?
Yes. I think it almost always makes sense to add a conversion where we currently fail.
My one concern is that I want to avoid a case where we could end up with multiple layers of wrappers: I don't want a Java PyObject wrapping a Python PyJObject that wraps a Java PyObject. I don't think that is a major concern with this change. As far as I can tell, it is impossible to get a Python PyJObject wrapping a Java PyObject, and even if this code did come across one, it would return the wrapped PyObject rather than rewrapping it. But it is something to keep an eye out for if you consider further restructuring.
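To make that layering argument concrete, here is a small model of the two branches in plain Java. PyHandle and JavaProxy are hypothetical stand-ins for the roles played by the Java PyObject and the Python PyJObject; nothing here is jep code, and the real check lives in C. The point it illustrates is only the branch order: because the assignability check runs first, a proxy that already holds a handle is unwrapped and never wrapped a second time.

```java
final class WrapperLayers {
    /** Stand-in for a Java-side handle on a Python object. */
    static final class PyHandle {
        final Object pythonValue;

        PyHandle(Object pythonValue) {
            this.pythonValue = pythonValue;
        }
    }

    /** Stand-in for a Python-side proxy around a Java object. */
    static final class JavaProxy {
        final Object javaObject;

        JavaProxy(Object javaObject) {
            this.javaObject = javaObject;
        }
    }

    private WrapperLayers() {
    }

    /** Unwraps when the wrapped object fits; wraps only when a handle is requested. */
    static Object convert(JavaProxy proxy, Class<?> expectedType) {
        if (expectedType.isInstance(proxy.javaObject)) {
            return proxy.javaObject; // unwrap, never rewrap
        } else if (expectedType == PyHandle.class) {
            return new PyHandle(proxy); // exactly one layer of wrapping
        }
        throw new ClassCastException("Expected " + expectedType.getName());
    }

    public static void main(String[] args) {
        PyHandle inner = new PyHandle("some python value");
        Object result = convert(new JavaProxy(inner), PyHandle.class);
        System.out.println(result == inner);
    }
}
```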
I'm working with a library that uses generators a lot to return potentially very large lists of values (several million).
It would be nice to map the Python generators to a similar concept in Java, like an Iterator (or even a Stream, if Java 8 is a target).
Currently I'm extracting a list out of the generator in Python, which limits what I can do to what can be stored in memory.
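A possible shape for the Stream side of this, sketched with only the JDK: any Iterator (such as one that pulls values from a Python generator) can be wrapped in a lazy, sequential Stream via Spliterators. LazyStreams is a hypothetical helper name, and the counting iterator in main is just a stand-in for a generator.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class LazyStreams {
    private LazyStreams() {
    }

    /** Wraps an iterator in a sequential stream that pulls elements on demand. */
    static <T> Stream<T> stream(Iterator<T> iterator) {
        Spliterator<T> spliterator =
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, false);
    }

    public static void main(String[] args) {
        // Stand-in for a generator that yields several million values.
        Iterator<Long> squares = new Iterator<Long>() {
            private long i = 0;

            @Override
            public boolean hasNext() {
                return i < 5_000_000L;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                i++;
                return i * i;
            }
        };
        // Only one element is held at a time; no list is materialized.
        long evenSquares = stream(squares).filter(v -> v % 2 == 0).count();
        System.out.println(evenSquares);
    }
}
```

The stream is deliberately sequential. A Jep interpreter is tied to the thread that created it, so a parallel stream over Python-backed elements would presumably not be safe.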