jpype-project / jpype

JPype is cross language bridge to allow Python programs full access to Java class libraries.
http://www.jpype.org
Apache License 2.0
1.12k stars 183 forks source link

Python 3.13 problems #1204

Closed astrelsky closed 1 month ago

astrelsky commented 4 months ago

This is now problematic because PyType_IS_GC implies it is on the heap and the pointer gets cast to PyHeapTypeObject. It will eventually read memory on the stack and interpret it to be some huge size and it will result in a MemoryError.

https://github.com/jpype-project/jpype/blob/653ccffd1df46e4d472217d77f592326ae3d3690/native/python/pyjp_value.cpp#L51-L59

I have tried replacing it with PyUnstable_Object_GC_NewWithExtraData but it can't handle nitems > 0. So I tried the following:

#if PY_VERSION_HEX>=0x030d0000
        PyHeapTypeObject heapType = *(PyHeapTypeObject *)type;
        PyTypeObject type2;
        type2.tp_basicsize = size;
        type2.tp_itemsize = 0;
        type2.tp_name = nullptr;
        type2.tp_flags = type->tp_flags;
        type2.tp_traverse = type->tp_traverse;
        heapType.ht_type = type2;
        obj = PyObject_GC_New(PyObject, &heapType.ht_type);
#else

That works fine until here https://github.com/jpype-project/jpype/blob/653ccffd1df46e4d472217d77f592326ae3d3690/native/python/pyjp_value.cpp#L215

Unfortunately it then tries to access (PyDictValues *)((char *)obj + sizeof(PyObject)) when it is not valid. The value at PyDictValues->valid is not zero so its checks pass and it moves forward until it faults. I suspect the java slot is being put isomewhere python is using and causing memory corruption but I haven't been able to prove it or find out what is really wrong.

Thrameos commented 4 months ago

They have made it very difficult to allocate memory with differing layouts. I tried to propse a fix, but it required putting in a PEP. Unfortunately I was almost immediately hit with an emergency at work and have been unable to find any time since.

The solution to this problem would be to create an actual heap type first then use it to allocate the desired memory rather than this static type hack. Or they need to give back access to making gc allocations. Given there is a slot for allocators, removing the gc call private basic deprecates the slot function.

I wish I had the time to persue the PEP but I have another 3 months of work before I can cycle back to major work here.

astrelsky commented 4 months ago

What I'm having trouble understanding is why this is even necessary at all? Shouldn't you be doing something like this?

Mostly copypasta from https://docs.python.org/3.13/extending/newtypes_tutorial.html#the-basics

struct JPValueObject {
    PyObject_HEAD;
    JPValue slot;
};

static PyTypeObject JPValueType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom.Custom",
    .tp_basicsize = sizeof(JPValueObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_new = PyType_GenericNew,
};

JPValue* PyJPValue_getJavaSlot(PyObject* self)
{
    if (PyObject_IsInstance(self, (PyObject *)&JPValueType)) {
        return &((JPValueObject *) self)->slot;
    }
    return nullptr;
}

The only one I foresee causing trouble would be PyJPClass since it's a metaclass and metaclasses that override tp_new are not allowed.

I'm sure there was a reason, if only at one point in time, for doing it the way it is, I'm just not seeing why.

Thrameos commented 4 months ago

Unfortunately you missed something vital. The Java slot can't appear in the structure at all. We are using this as a mixin because we must be compatible with all possible Python concrete types. If you try to put it in the structure then it will conflict with int, float, exception etc which we must derive from as all base types are concrete. We thus used the alloc slot to adjust shift the Python structure down 8 bytes and added the needed data in front of the Python structure where we could always guarantee access at order 1. The issue is that Python developers decided to start placing their info in front to the Python object with no api to handle conflicts. This basically breaks the whole concept of the alloc slot as you have no way to alloc object memory with variable size.

I wrote a PR to give an API but have no way to pursue it.

astrelsky commented 4 months ago

So it seems to be a combination of problems. First is the one already discussed and the second I think is changes to how a type is created which makes PyJPClass_FromSpecWithBases no longer work properly.

I had hoped to be able to figure out the problem myself and have my employer submit the fix. Unfortunately it would probably require significant refactoring that would take me months, due to the lack of familiarity with the jpype internals, and I'm not able to justify it. It seems the best way would be to tell you what I know and let you handle it.

The only thing I could think of that might not require a major rework would be to stick the JPValue in a python object and assign it as a member def. This pretty much defeats the whole purpose of having the type implemented in c++ though.

Also, c++ not supporting c99 designated initializers is extremely annoying.

Thrameos commented 4 months ago

Thanks, I will take a look.

--Karl

Thrameos commented 3 months ago

I am still working this one, but thus far it looks hopeless. They have removed all access to the GC (alloc, link, but still have track). Thus creating a working allocation using the alloc slot means either calling into their available calls as a side effect to get an allocated and linked object or replicate their private structures and somehow gaining access to the current state.

Then they used the custom allocation slot (which we have been using for a while) and allocated extra memory before the object and extra memory behind other objects (which was never reported to the allocator). The problem with JPClass is that types request 40 extra bytes at the end even though they have an item count of zero. Thus the crash was just making a call to the type code while using a custom allocator. My allocator does know that they need an extra 40 bytes (and that is where I was going to place my memory.) While other objects just allocate the correct space.

I could go back to adding my extra memory in front of the GC head, but they have now moved the weak reference tables and the dictionary pointer there. And as the allocations are all hard coded I can't politely ask for a few extra bytes that I require.

They were going to give a mechanism for user requested data, but it doesn't handle their broken object layout meaning that it is useless for much other than a generic object. I tried to send a fix, but they requested a PEP which given I have been going 60 hours a week for the last 9 months just can't happen.

Thus I am back to horrible kludges. I was allocating my object by adding extra space to a temporary type then mutating it to my object. But that wont work now that they are using custom allocators that lie about the amount of space to allocate. My memory and their extra memory will collide and I would need a secret table of types to figure out which classes need custom allocators. I can't add to the front because they are using the front already, and they have barred access to the allocator, meaning unless I can post mutilate to move the GC head that route is out. Adding to dictionary (very slow) doesn't work because some types don't accept dictionaries. They already did bad things like reusing the ob_size field with integers as no one was accessing it (I was!) The more they mutate and kludge their type system without any long term plan of how someone who needs O(1) access to multiple inheritance, the harder it will be maintain something like JPype. The fact that they are using the allocator slots for their classes and I was just doing the same should have been a clue that they needed a formal API. Instead they have just rendered it impossible to carry out the same patterns they use internally. It would have cost nothing to just make PyObject_GC_Alloc and PyObject_GC_Link accessible to the user as those two are required to create custom memory allocation for GC types.

@marscher I have found one pattern that appears to work, but I have to merge it and test with every prior version. Unfortunately if someone can't convince the Python developers to merge my patch for user supported data into CPython, we may be down to our last supported version of JPype. The developers have hidden all of the required functions to replicate the allocator functions, placed assertions that preview mutating the object during allocation to add a few extra bytes, and have taken over both the area in front and behind the object so without an API to support user data in multiple inheritance I am out of options. I tried to send a patch to prevent this outcome, but they sat on it for months, then told me that I had to write a PEP. Unfortunately by the time they got back to me it was a crisis at work and I have been at 60-80 hr work weeks without a break for months, and it likely won't let up soon. Someone else would have to step up to push this forward.

astrelsky commented 3 months ago

I have some ideas that should create a path forward with less issues, it will just require a non-trivial refactor. The main caveat would be that each JClass instance would require two python object allocations. With the exception of primitive types and exceptions, they would be a generic object where you could place the jobject instance at the end. I think having each type have it's own allocator where necessary instead of all of them using the same allocator will reduce the amount of problems going forward. I expect to run into a handful of roadblocks but it is a problem I am interested in and have no problems spending all of my personal time on it. (Note: communication is difficult for me, if this didn't make much sense, don't worry about it).

The biggest challenge I'll likely face is getting it working and finishing before I lose interest 😅

Thrameos commented 3 months ago

There were serious issues with the two object approach which is why it was abandoned in Jpype 0.7. The biggest being Isa relations. Primitives, exceptions and the like can't be duck typed to meet their contracts. The must be inherited directly from the Python objects they seek to emulate. They also have there own incomparable memory layouts. And for float and long you can't add anything to the base. You can make a special exception type for it's objects.

This brings about the second issue which is the look up time. If you have many layouts then figuring out if it is a class derivative takes time. With the double object we used this was a substantial number of calls. And to test a method overload yous have to check for it for each combination of overloads. If we break that jpype gets rather slow. We could bypass this requirement by adding a slot offset to the type object meta class.

All the other language binders have the same problem and thus a get user data method was to be added to basic Python object system, but the implementation missed as it was still incomplete with int, float, and exception (they didn't test can this inherit from every python type). The solution is to fix their type system by making a facility for any type to access user data in O(1). It isn't hard as I already submitted a patch.

Trying to fix it at jpype end as they have constantly changing their type system is a losing battle. I would recommend taking the time for the for all solution which is trivial (negative baseline implies the data will be added to the preheader) rather than reworking jpype where you will get a dozen edge cases.

astrelsky commented 3 months ago

There were serious issues with the two object approach which is why it was abandoned in Jpype 0.7. The biggest being Isa relations. Primitives, exceptions and the like can't be duck typed to meet their contracts. The must be inherited directly from the Python objects they seek to emulate. They also have there own incomparable memory layouts. And for float and long you can't add anything to the base. You can make a special exception type for it's objects.

This brings about the second issue which is the look up time. If you have many layouts then figuring out if it is a class derivative takes time. With the double object we used this was a substantial number of calls. And to test a method overload yous have to check for it for each combination of overloads. If we break that jpype gets rather slow. We could bypass this requirement by adding a slot offset to the type object meta class.

All the other language binders have the same problem and thus a get user data method was to be added to basic Python object system, but the implementation missed as it was still incomplete with int, float, and exception (they didn't test can this inherit from every python type). The solution is to fix their type system by making a facility for any type to access user data in O(1). It isn't hard as I already submitted a patch.

Trying to fix it at jpype end as they have constantly changing their type system is a losing battle. I would recommend taking the time for the for all solution which is trivial (negative baseline implies the data will be added to the preheader) rather than reworking jpype where you will get a dozen edge cases.

I am tracking the conversation in the pull request. Hopefully things go in the right direction. For now I will sit down and be quiet.

Thrameos commented 3 months ago

Sounds good. There are only three changes in Python that are needed to eliminate all of the black magic that we are doing.

1) The multiple inheritance patch which adds our data. 2) A method to inject stack frames from Java exceptions. 3) A way to lazy load strings so that Java and Python strings can be unified.

Many language binding (ie C#) need these same pieces. Each isn't that complex, but it will take time and effort to press them forward.