bcdev / jpy

A bi-directional Python-Java bridge used to embed Java in CPython or the other way round.
Apache License 2.0
187 stars 37 forks source link

PyObject shall not be Serializable #150

Closed desruisseaux closed 5 years ago

desruisseaux commented 6 years ago

PyObject shall not be Serializable. This is a reverse of pull request https://github.com/bcdev/jpy/pull/80. As pointed-out in the comment of that pull request, making PyObject serializable is very dangerous - almost suicidal - because the PyObject.pointer value will be invalid if the object is deserialized by a different JVM than the one that serialized it, which is likely to cause a JVM crash.

Even if the object is serialized and deserialized by the same JVM, it is still dangerous because PyObject.finalize() decrements the Python reference count. If a PyObject is serialized, then garbage collected by JVM, then "garbage collected" by Python because the reference count reached zero, then deserialized, we will still have a JVM crash even if no JVM boundary has been crossed.

Note also that there is nothing is current code for increasing the reference count on deserialization, thus causing an invalid count. Garbage collection after deserialization will cause the reference count to reach zero prematurely, and even become negative afterward.

desruisseaux commented 5 years ago

Would it be useful if I add a simple test case that reproduce the crash? Basically just serializing and deserializing on a different JVM, or on the same JVM but after the garbage collector has been run, should demonstrate a crash.

forman commented 5 years ago

I agree. But Scala/Java in same VM use case is still a valid one. It would be better if we'd use PyObject factory that can produce a SerializablePyObject if jyp is configured to do so. I'll create an issue for this.

desruisseaux commented 5 years ago

Note that even with Scala/Java in the same VM, it is likely to crash in current state because the reference count is not incremented at deserialization. This can cause PyObject.finalize() to discard a Python object while it is still in use by another PyObject.

forman commented 5 years ago

Right.

desruisseaux commented 5 years ago

You are welcome, thanks for the merge!