py4j / py4j

Py4J enables Python programs to dynamically access arbitrary Java objects
https://www.py4j.org

Long type Map key problem with Python 3 #374

Open marianp123 opened 5 years ago

marianp123 commented 5 years ago

I have this problem when running with Python 3.7; it works fine with 2.7. I call a Java method from Python that returns a java.util.Map with Long keys. The returned JavaMap has entries with the correct keys (as ints, since Python 3 has no separate long type) but None values. I understand that Py4J converts Python int values to Java long for values outside the Java integer range, which is reasonable, but this case is different and, in my opinion, should be handled by Py4J. The only workaround is to use Integer Map keys on the Java side.

    public Map javaMethod() {
        Map map = new HashMap();
        map.put(25L, 26);
        return map;
    }

    _dict = gateway.JavaClass.javaMethod()
    print(_dict)  # prints: {25: None}

    ...
    map.put(((long) Integer.MAX_VALUE), 26);

    _dict = gateway.JavaClass.javaMethod()
    print(_dict)  # prints: {2147483647: None}

    ...
    map.put(((long) Integer.MAX_VALUE) + 1, 26);

    _dict = gateway.JavaClass.javaMethod()
    print(_dict)  # prints: {2147483648: 26}

thanks

qvad commented 5 years ago

@marianp123 I don't think there is any way to catch this on the Py4J side. With generic types, you have no information about the type parameters at runtime. By contrast, when you call a Java method from Py4J, you have all the information about the method's argument types.

So a solution that loops through all the number types when getting a value from a generic collection looks strange. (You'd have the same problem with Byte, Short, etc.)
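To make the objection concrete, here is a minimal Python model of that fallback. It is not Py4J code: `JKey`, `CANDIDATE_TYPES` and `lookup_any_width` are hypothetical names, and the (type name, value) pair only imitates the way Java treats `Integer`, `Long`, `Short` and `Byte` keys as unequal even when they hold the same number.

```python
from typing import Any, NamedTuple, Optional


class JKey(NamedTuple):
    """Hypothetical stand-in for a boxed Java number: equal only if type and value match."""
    java_type: str
    value: int


# Order in which the boxed types are tried; an arbitrary choice made for this sketch.
CANDIDATE_TYPES = ("Integer", "Long", "Short", "Byte")


def lookup_any_width(java_map: dict, value: int) -> Optional[Any]:
    """Try the same numeric value under every candidate key type, first hit wins."""
    for java_type in CANDIDATE_TYPES:
        key = JKey(java_type, value)
        if key in java_map:
            return java_map[key]
    return None


# A map keyed by Long is found on the second attempt.
long_map = {JKey("Long", 25): 26}
print(lookup_any_width(long_map, 25))

# Integer 25 and Long 25 can coexist as distinct keys; the loop silently picks one.
mixed_map = {JKey("Integer", 25): "from Integer", JKey("Long", 25): "from Long"}
print(lookup_any_width(mixed_map, 25))
```

The second call shows why the approach is fragile: the result depends on the order of `CANDIDATE_TYPES`, not on anything the caller expressed.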

I worked around this behaviour by creating a special method on the Java side, which is very ugly:

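@SuppressWarnings("unchecked")
public Object getTypedKey(Cache cache, Object key, String type) throws ClassNotFoundException {
    // The key arrives as some Number; re-box it as the type named by `type`.
    // createLong, createByte and createShort are helper methods not shown here.
    switch (type.toLowerCase()) {
        case "long":
            return cache.get(createLong((Number) key));

        case "byte":
            return cache.get(createByte((Number) key));

        case "short":
            return cache.get(createShort((Number) key));

        default:
            // Otherwise treat `type` as a fully qualified class name and cast the key to it.
            return cache.get(Class.forName(type).cast(key));
    }
}

marianp123 commented 5 years ago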

@qvad Java reflection can provide a collection's generic types when they are declared in the method signature. Py4J returns a JavaMap (a dictionary-like object). Since it is doing the conversion, it should take care of this special case. The JavaMap object contains the key, but no value, and I have no way to retrieve the value unless I change the Java-side method. I have had this conversion problem in other cases too, and I am not sure whether long would be a better default than integer. It's a tricky problem. An option to customize the behavior per call might be the best.

bartdag commented 5 years ago

Hi, this is currently a limitation of Py4J and there is no easy way to get around that.

As you noticed, the merge of the int and long types in Python 3 makes it impossible to be 100% right about type conversion from Python to Java without an explicit type declaration or hint.

The code responsible for converting a Python int to a Java primitive in Py4J (simplified for this case) looks like this:

    ...
    elif isinstance(parameter, int) and parameter <= JAVA_MAX_INT\
            and parameter >= JAVA_MIN_INT:
        command_part = INTEGER_TYPE + smart_decode(parameter)
    elif isinstance(parameter, int):
        command_part = LONG_TYPE + smart_decode(parameter)

If you try to pass a gateway.jvm.Long(25) as your dictionary key, the Long instance will first get converted to an int on the Python side because of autoboxing and autounboxing behavior in the JVM, so this is not an option.
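A small, self-contained Python model of this rule reproduces the three results from the original report. It is only a sketch: `java_type_for` and `model_lookup` are hypothetical names, the map is an ordinary dict keyed by (type name, value) pairs, and it assumes that a value lookup sends the Python key back through the same range check and that a Java `Integer` key never equals a `Long` key.

```python
JAVA_MAX_INT = 2147483647    # Integer.MAX_VALUE
JAVA_MIN_INT = -2147483648   # Integer.MIN_VALUE


def java_type_for(value: int) -> str:
    """Mirror the range check: ints in the 32-bit range go out as Integer, others as Long."""
    if JAVA_MIN_INT <= value <= JAVA_MAX_INT:
        return "Integer"
    return "Long"


def model_lookup(java_map: dict, python_key: int):
    """Look up a Python int in a map keyed by (java_type, value) pairs."""
    return java_map.get((java_type_for(python_key), python_key))


for key in (25, JAVA_MAX_INT, JAVA_MAX_INT + 1):
    java_map = {("Long", key): 26}   # models map.put((long) key, 26)
    print({key: model_lookup(java_map, key)})
```

Running it prints `{25: None}`, `{2147483647: None}` and `{2147483648: 26}`: only the key that falls outside the 32-bit range is sent back as a Long, so only that lookup matches.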

I think it would be relatively easy to add type hints when making a call from Python to Java, but you would have to do so explicitly for each call, which is not desirable in all cases. For example:

# return_value is a short in Java, but it gets converted to an int in Python
return_value = my_java_object.get(TypeInt(25, JavaType.PRIMITIVE_LONG))

return_value2 = my_java_object.get(TypeInt(return_value, JavaType.PRIMITIVE_SHORT))
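One way such a hint could take precedence over the range check is sketched below. Nothing here exists in Py4J: `TypeInt` and `JavaType` borrow the names from the example above, `PRIMITIVE_INT` and `encode` are added for illustration, and the `tag:value` strings are placeholders rather than Py4J's wire format.

```python
from dataclasses import dataclass
from enum import Enum

JAVA_MAX_INT = 2147483647
JAVA_MIN_INT = -2147483648


class JavaType(Enum):
    """Hypothetical hint values; the tags are illustrative only."""
    PRIMITIVE_INT = "int"
    PRIMITIVE_LONG = "long"
    PRIMITIVE_SHORT = "short"


@dataclass(frozen=True)
class TypeInt:
    """A Python int paired with the Java type the caller wants it sent as."""
    value: int
    java_type: JavaType


def encode(parameter) -> str:
    """Return 'tag:value'; an explicit hint wins over the range-based default."""
    if isinstance(parameter, TypeInt):
        return f"{parameter.java_type.value}:{parameter.value}"
    if isinstance(parameter, int):
        if JAVA_MIN_INT <= parameter <= JAVA_MAX_INT:
            return f"int:{parameter}"
        return f"long:{parameter}"
    raise TypeError(f"unsupported parameter type: {type(parameter).__name__}")


print(encode(25))                                    # default: range check picks int
print(encode(TypeInt(25, JavaType.PRIMITIVE_LONG)))  # hint forces long
```

Unhinted calls keep today's behaviour, so a change along these lines would only affect call sites that opt in.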

If someone wants to take a shot at a PR to implement type hinting, I would be willing to help and advise on which part of the code needs to be changed.

marianp123 commented 5 years ago

@bartdag thanks for the information. I'd gladly prepare the PR. It might take me a few weeks to get to it though; it's summer :)

srilman commented 2 years ago

Hi. I'm dealing with a similar issue when trying to construct a list of Java Long objects to pass to a Java function. Is this still being worked on? I would be happy to try to make a PR for the type hinting.

HyukjinKwon commented 2 years ago

There have been no updates here since https://github.com/py4j/py4j/issues/374#issuecomment-513463892. A PR is very welcome!

srilman commented 2 years ago

I've opened up a PR for this! Sorry for the delay!