tonybaloney / CSnakes

https://tonybaloney.github.io/CSnakes/
MIT License
244 stars 18 forks source link

Remove PyObject usage from the CPythonAPI, refactor PyObject to WrappedPyObject, Contract based optimizations #272

Open minesworld opened 4 days ago

minesworld commented 4 days ago

Lets set three goals:

Different developers have different needs. As of now CSnakes simply works "out of the nuget package" in all cases but at the cost of overhead. Will be fine this way for those whose code spends most if its time Python code compared to how fast accessing the results is on the C# side. Different story for those who will present the results in a WinUI: its a bad user experience if the CLR GC kicks in while scrolling thru a list - chances that the user will blame the correct one are minimal...

Remove PyObject (as a C# object) usage from the CPythonAPI

This code in the CPythonAPI is simply "wrong" (already refactored to use a Raw interface for the DLL calls):

    internal static nint PyList_GetItem(PyObject obj, nint pos)
    {
        nint item = Raw.PyList_GetItem(obj.ptr, pos);
        if (item == IntPtr.Zero)
        {
            throw PyObject.ThrowPythonExceptionAsClrException();
        }
        Raw.Py_IncRef(item);
        return item;
    }

CPythons C-API PyList_GetItem does not throw an exception and does not call PyInrecRef on its own.... .

While the code does the right thing from a C# developers perspective, to C developers knowing CPythons C-API its "confusing" in two ways:

We will make it clear by having CPythonAPI using only IntPtr, no C# wrappers. Maybe additionaly defining PyObject as a IntPtr so translations of C code need even less typing.

As a C# wrapper is used, this code doesn't belong in the CPythonAPI at all.

Refactor PyObject to WrappedPyObject

There are some Python bridges which have the ability to modify implementations from both sides transparently. In effect a Python object can be an "interesting" mixture of C# and Python calls when used in Python code from the PythomInterpreter... This isn't the goal of CSnakes.

Also a problem arises when there is an overidable C# interface for CPython objects - when provided defintions are changed the outcome might have some unwanted side-effects which might be hard to debug.

The current usage of PyObject is three fold:

Renaming the class from PyObject to WrappedPyObject makes it clear what this class does.

Renaming all functions which take a WrappedPyObject as a parameters to WrappedPyXXX() makes it clear that C# GC and Exception rules apply for all parameters as its result.

So the above PyList_GetItem would be:

    internal static nint WrappedPyList_GetItem(WrappedPyObject obj, nint pos)
    {
        nint item = Raw.PyList_GetItem(obj.ptr, pos);
        if (item == IntPtr.Zero)
        {
            throw PyObject.ThrowPythonExceptionAsClrException();
        }
        Raw.Py_IncRef(item);
        return item;
    }

Contract based optimizations

CSnake can only optimize correctly "automagically" if the developer using its APIs assures that his usage of Python does something NOT.

Such an usage assurance is a Contract from the developer made with CSnakes. If the developer breaks a given contract - neither CSnakes fault nor problem.

If there is no Contract made, CSnakes will always work at the cost of the needed overhead. In case that is to much overhead, the developer can go thru the list of available contracts and made those ones applyable for his code/use case.

The Contract API should be made in a way that a developer willing to go the extra mile could add his own contracts and optimizers so the need to implement is minimized in regard to CSnakes source code changes.

A Contract should fail at creation time if a wrong condition is detectable. Like creating a Contract assuming a Global GIL when the CPython DLL is compiled in different.

Once a Contract is made it should not be checked again if its valid as this would imply overhead. Its not the wrong code path applied by CSnakes - its the wrong Contact made be the developer...

Optimizations enabled by Contracts are never implemented wthinin the CPython-API calls, those given IntPtr/PyObject, but either within WrappedPyObject calls or as utility functions to them. Lets see what makes the most sense.

If a developer uses calls below the CSnakes API where it is assured by CSnakes that Contracts are respected - not CSnakes problem. We will make it clear in the documentation that such a mixed usage might have side-effects and is not advised by CSnakes.

Examples why Contracts are the way to go:

These optimization are a piece of cake to implement compared to designing a clean Contract/Optimization API. But it should be worth that effort as CSnakes should become the "go to" option for developers...

minesworld commented 2 days ago

Here is a preview of my work in progress: https://github.com/minesworld/CSnakes/commit/73d2caf2380f7299a552de9accc096d4c6a447a8

Refactoring

The new big picture is three layered base classes in the Runtime:

still "some" refactoring to do - but looks very promising. CSnakes.Tests and CSnakes.Runtime.Tests pass...

These classes must be made public - a developer should be able to write code using:

When refactoring is done, it should be possible to mix these layers as wanted or needed. Thus someone can begin with the PythonObject layer, benchmark it and go way down where needed...

Context object

Whats missing and "coming soon" - a "genius" but still obvious Context object !!!

We all know CPython context already:

var gilState = CAPI.PyGILState_Ensure();

# some CSnakes.Runtime.Unmanaged.CAPI and/or CSnakes.Runtime.CAPI calls

CAPI.PyGILState_Release(gilState);

As doing something on the GIL or ensuring ThreadState might be expensive, most code tries to avoid that as shown above... But WHAT happens in THIS code?

var gilState = CAPI.PyGILState_Ensure();

# some CSnakes.Runtime.Unmanaged.CAPI and/or CSnakes.Runtime.CAPI calls
# some CSnakes.Runtime.Python usage...

CAPI.PyGILState_Release(gilState);

NOT what the developer intended: NOT calling the GIL. Chances are some

using (GIL.Acquire()) {}

are being called within CSnakes.Runtime.Python ... not good. And the developer can not change that without forking CSnakes .

Context object to the rescue !!!

It manages the GIL. So on instead in CAPI code creating a GIL a Context is created and its GIL aquired. When a PythonObject is created, that Context object is stored there too. So it can check in using (Context.AquireGIL()) {} only if that is really needed...

Can't wait to implement and benchmark it - maybe some LINQ on 10.000 PythonObjects using 5 attributes? Should make a difference NOT calling GIL.Aquire at least 50.000 times...

But - thats just the bare mimimum. What about Multiple Interpreters ?

Someone writing CAPI code will be aware that using CPython objects from multiple interpreters together is a "no go". His fault if doing that...

But: passing wrapped in PythonObject to C# code "somewhere" it might get overseen. So a Context object could also store from WHICH interpreter the CPython object originated. And the C# code could fail gracefully instead CPython producing really nasty bugs...

Optimization Contracts

"Genius"? No - but THIS is: putting Optimization Contracts into the Context !!

First of all - pure C-API calls can't be optimized as those a just calls into the DLL. But there a plenty of non C-API calls in all three layers. Those located in Unmanaged.CAPI and CAPI have a different nameing scheme, so they are distinguishable as Non C-API calls.

These calls will get the Context as an extra parameter, so different code paths could be executed depending on which Contracts are available and applyable.

I expect huge gains from functions which give strings to the C-API. Caching or even better a UTF16-first CPython implementation - near zero overhead will be possible.

In fact, the DLL calls on Unmanaged.CAPI and CAPI are done WITHOUT string marshalling. THAT is done within those special named functions where needed...

BTW: if a developer makes a no multiple interpreter Contract, that checks don't need be done...

Interested or should I rename my fork?

@tonybaloney please take a look at it. Suggestions are welcome... some name changes might be needed. And I'm not an expert in C# class access rights. The goal is that the developer can write his code under whatever namespace he likes using CSnakes.Runtime.CPython ...

PS: if we have a working optimizing prototype - I bet this will attract a lot of developers.

What about a Contract that some class definitions wont change? With a custom CPython DLL it might be possible to cache the lookup for a class and shorten that path. Its like what in Objective-C was possible: looking up the implementation of the call and then calling that directly in a loop... . Beside that: Objective-C had a per class lookup cache anyway, but why destroy that?

We have full control over the sources - on the CSnakes and CPython side. The sky is the limit, we "only" need "crazy developers" wanting to exploit that... And who knows - in a bright future those principles might be used by the CPython core team too. At the moment the type annotations are hints for the IDE editor. But think about modules specifying what contracts they fullfill..

PPS: before someone comes up with "reasons" why that GIL code example might fail and thus can't be used - the solution is called 'Contracts" ...

tonybaloney commented 2 days ago

There's a few reasons why the CPythonAPI wrapper doesn't just map 1:1 to the CPython API and why we don't expose it directly.

The main reason is that the CPython C-API is a mess. It is enormously inconsistent whether it returns a borrowed reference, a stolen reference, a strong reference or a new reference. Sometimes, the reference guarantee changes between CPython versions! For example, fetching frame objects changed from a borrowed to a new reference in Python 3.12.

And sometimes the documentation is wrong.

I don't really want to maintain a low-level CPython API wrapper. That's not the goal of this project either. We're trying to build a safe, friendly API to simply call a Python function and library from a .NET project, without requiring deep knowledge of Python's API.

Working out how to maintain and balance references between a GC/RefCnt and the CLR is hugely complicated and going to result in either memory leaks or segmentation faults unless you really, really know what you're doing.

Should make a difference NOT calling GIL.Aquire at least 50.000 times.

The current GIL context wrapper is already recursive, so if you wrap the GIL once, you can call 50,000 times inside that context and it won't acquire and release the GIL each time anyway.

Holding the GIL too long is a problem and Py_Decref should be called with the GC held so we have special code to queue object disposable to avoid deadlocking.

Python.NET has a lower-level API, have you looked at that implementation as well?

minesworld commented 1 day ago

OK, usage of unstable CPython API is a good argument...

What about making the stable API public, the rest protected?

This way a developer can "consume" the stable API in his namespace, if someone really wants to use the unstable, he "extends" the runtime. And by extending the runtime the one who does is responsible. Might be no problem for those who use their fixed version of the CPython DLL ...

The problem is: there should be an easy way to write some "fancy" stuff NOT supplied by your PyObject (PythonObject called by me).

Lets say I want to avoid "mass creation" of strings during dictionary lookups, know the needed keys beforehand. So the "logical" thing would be:

using CSnakes.Python API like this:

var key = aPyDictionary.Keys[0]; var fastValue = aPyDictionary[key];

will be key a string or PyUnicode object? Guess it would be PyDictionary<string,somewhat> generated. And PyDictionary<PyUnicode,somewhat> might work for caching, but might have issues on the WinUI side...

So the best place for a "generic" solution is the wrapping function for PyObject_GetItem ...

BTW: PyDictionary is caching via _dictionary? That would create headaches "value wise" for me... besides creating unneeded objects... . Even in pure C# code I can watch the GC heap grow while scrolling virtualized views...

But: it might be a valid performance strategie to apply for others.

C# has AsyncLocal which could be a solution for a ContextManager, implementing an Contract/Optimization-stack on a list itself, thus hopefully without capturing unneeded variables etc. Should be fine using like this:

using(new ContextScope(Use_PyDictionary_dictionary)) { [...] }

var keys = new List { "A", "B", "C" } ; using(new ContextScope(new CacheKeysAsPyObjectsEqual(keys))) { [...] }

So the questions would be: