Closed minesworld closed 3 weeks ago
Much work to do - I know. As I've done some parts of it already in my "minesworld" branch here: https://github.com/minesworld/CSnakes/tree/minesworld
This branch already has refactored the CAPI - but my bad that I didn't looked at the pythonnet internals before doing so. So instead having those genious NewReference
, BorrowedReference
... I did only alias the nint
to pyoPtr
and created a MPyOPtr
subclass of SafeHandle
which also acts as the base class for the Runtime.Python
classes...
Refactoring ThrowPythonExceptionAsClrException
comes next - came up with that idea while writing the issue above.
My "minesworld" branch is forked from 6bfe4ff5 and I was going to implmenet CSnakes main changes up to current 513fa03, but then thought that the chances having those changes pulled would be greater by quickly filing an "Issue" ...
At the end I spent 12+ hours doing so. Just mentioning to show that I'm serious about the above issues.
@tonybaloney please make a decission in which direction CSnakes should go as incorporating from your main branch will get more difficult with each "major" refactoring done in a branch...
@tonybaloney a personal "note": your github page says:
0.10X developer. Python Advocate at Microsoft. Python Software Foundation Fellow. Fellow of Macquarie University, Sydney
Great for you. And CSnakes too as its get "some attention" by you having those connections and promoting it...
But with stating those connections comes some responsibility. Every repository you put online "falls back" to the mentioned organizations. Don't blame me - it is like it is...
Its beyond my understanding why CSnakes isn't an official Microsoft proejct, hopefully you can work during your paid time on it (which would mean that it is an inofficial MS project until is mature enough to be an official one...) .
Truth is - I have a hard time understanding whats going on at Microsoft. Having switched to Windows after being a postchild Mac fanboy for 25 years. Earned my living adminstrating Macs and developing software...
There were many reasons to switch over the time, but the final straw was I wanted to use the Microsoft Studio (best computer EVER) and developing software wih pen input. But at the Windows AppSDK / WinUI3 they are "thinking" about putting PenInput on the roadmap. Very "funny" ...
At least Apple knows how to sell their own hardware while Microsoft has a Surface line Apple users can only dream of, but there seems no easy way to use those fantastic computers to their full potential. What a shame.
PS: as you might already reach to conclude - I am something like a "programming language fetishist". And briding two different languages with different idioms is hard to be done the right way...
PPS: accessing via brackets - as x = o[a] works on CSnakes.Runtime.Python C# classes the same as on python when an object implements both dict and list interfaces? Too bad Guido didn't think about that when designing Python as there is an easy solution to that problem...
As mentioned before, I'm not making major changes to the goals of the project.
Stop "inventing" CPython C-API calls
the CPython CAPI convention is that calls begin with Py or Py[Type]
To make it clear if a call is made into CPythons CAPI those prefixes should be only used for those calls.
CSnakes.Runtime
instead defines calls itself, likeCSnakes.Runtime.CPython.CPythonAPI.PyBytes_FromByteSpan
as a wrapper forPyBytes_FromStringAndSize
even worse, as CSnakes defines an C# object wrapper PyObject . And there are calls like
CSnakes.Runtime.CPython.CPythonAPI.PyBytes_AsByteArray
which take that C# PyObject instead a CPython CAPI pointer to the PyObject C struct.Of course there is no
PyBytes_AsByteArray
within CPythons C-CAPI.Lets look at a convinience function from CSnakes.Runtime.CPython.Bytes.cs implements in
CPythonAPI
and CSnakes.Runtime.CPython.Object.cs implements in
CPythonAPI
Two problems with that:
PyObject_IsInstance
call, some might think thatIsBytes
will make a direct call to CPythons C-CAPI . But this NOT the case.IsBytes
callsCPythonAPI.PyObject_IsInstance
which does additional error checking, even is able to throw an exception which NO CPython C-CAPI does.Another "ugly" thing is the usage of C# namespaces. As
CSnakes.Runtime.Python
definesPyObject
CSnakes.Runtime.Python
is a "user" ofCSnakes.Runtime.CPythonAPI
, sits "atop" of itCSnakes.Runtime.Python.PyObject
wraps the CPytons C-API PyObject C struct using a IntPtr to itthere should be a clear line between both,
Runtime.Python
andRuntime.CPythonAPI
-Runtime.Python.PyObject
should NOT be used from withinRuntime.CPythonAPI
.Instead
Runtime.CPythonAPI
should ONLY either use IntPtr or nint as references to CPythons PyObject C struct.What is PyObject ?
There is a saying:
A
PyObject
is aPyObject
is aPyObject
Is this really true for CSnakes ? Lets make the test:
A
C struct PyObject
is aCSnakes.Runtime.CPythonAPI nint pointer
to it, is aCSnakes.Runtime.CPythonAPI IntPtr
wrapper to it, is aRuntime.Python.PyObject
C# object, wrapping thatSo in CSnakes
PyObject C# object
is a Python C-CAPIC struct PyObject
... ?!?
To my (limited) knowlegde there is only one dotnet Python project which could name its Python base class object
PyObject
: IronPython as it is a native dotnet implementation of Python.All dotnet projects "wrapping" an underlying non CLR python runtime should NOT name its C# Python object base as the runtimes base name. Period.
Otherwise it would be confusing. And CSnakes is...
URGENT !!
CSnakes needs
to solve the design and resulting usage problems described above.
naming convention
Technically current
Runtime.Python.PyObject
is a C# object wrapping anIntPtr
. So the technically correct naming would bePyObjectWrapper
. Which isn't that "beautifull"...Naming that
PyObjectWrapper
CPythonObject
would refer technically to the CPython runtime implementation but is ambiguous from a language use view as a C# object isn't a C object.So calling the technically correct
PyObjectWrapper
asPythonObject
is correct from both the technically and language view points.Refactor the
CSnakes.Runtime.Python.PyObject
base class toCSnakes.Runtime.Python.PythonObject
and all other Python.PyXXX classes accordingly.The use of Py_ and PyXXX name prefixes is EXCLUSIVE to CPython C-API and such the direct DLL calls.
But from that restriction some problems arise. Lets take a look at CSnakes.Runtime.CPython.Bytes.cs 👍
PyBytes_FromByteSpan
describes what it does, but break the above rule. Even more the CPython maintainers are free to define a function with the same name... .Leaving the underscore away, such renaming
PyBytes_FromByteSpan
toPyBytesFromByteSpan
would minimize the risk of CPython maintainers of making that an public C-API call.But the only "real" solution is to rename in a technically correct way by using human language semantics. Here the forbidden
PyBytes_FromByteSpan
would become an also technically correctByteSpanToPyBytes
.Refactor
CSnakes
that naming respects the protected CPython C-CAPI names without being ambiguous about the nature of the used objects.The current definition of
PyObject_IsInstance(PyObject ob, IntPtr type)
within theCSnakes.Runtime.CPython.CPythonAPI
class breaks the PyXXX prefix rule, might throw exceptions and has a parameter of typePyObject
which is a C# wrapper. AlsoCSnakes.Runtime.CPython
doesn't include thePyObject
class, it is defined in theCSnakes.Runtime.Python
namespace.functions and function parameters should only used types defined within that namespace or sub-namespaces
The callees of
CSnakes.Runtime.CPython.CPythonAPI
PyObject_IsInstance(PyObject ob, IntPtr type)
are all within theCSnakes.Runtime.CPython
namespace, all in functions defined likeIsBytes(PyObject p)
. This usage pattern shows thatPyObject_IsInstance(PyObject ob, IntPtr type)
should be namedIsInstance(PyObject ob, IntPtr type)
. But still - wrong namespace, it should be inCSnakes.Runtime.Python
as the calling functions likeIsBytes(PyObject p)
.Refactor
CSnakes
that functions reside wihtin the correct namespace, the one which defines the types of parametersinter API call convention
CSnakes design implies a top-down functional usage hierachy as:
CSnakes.SourceGeneration
CSnakes.Runtime.Python
CSnakes.Runtime.CPython
As such
CSnakes.Runtime.Python
might useCSnakes.Runtime.CPython
, but should not the other way around. This way each namespace should be able to run without dependencies on upper ones. Also bottom-top calls make it tempting to define functions in the wrong namespace.only top-down functional type definitions: like
CSnakes.Runtime.Python
can useCSnakes.Runtime.CPython
, not the other way only top-down functional calling:: likeCSnakes.Runtime.Python
can useCSnakes.Runtime.CPython
, not the other wayRefactor
CSnakes
to only use top-down functional type definitions and callsC# namespace usage convention
CSnakes usage hierachy correspondends to a "worry" usage hierachy from its users (developer) point of view. From hard to easy:
CSnakes.Runtime.CPython
-- CPython PyObject C struct used as nint and IntPtr must be managed via Py_IncRef and Py_DecRef calls -- also the GIL has to be aquired before and released after calls toCSnakes.Runtime.CPython.CPythonAPI
static API. -- no type conversions like CPythons C-CAPI PyList to C# ListCSnakes.Runtime.Python
-- lifetime management is handled byPyObject
which is based onSafeHandle
-- the GIL will managed by the implemented function itself -- type conversions like CPythons C-CAPI PyList to C#CSnakes.Runtime.Python.PyList
CSnakes.SourceGeneration
-- adds some magic creating classes using the classes defined inCSnakes.Runtime.Python
from python source codeAs such
functions defined
CSnakes.Runtime.CPython
should not manage the GIL for usage within such a functionCSnakes.Runtime.Python
code manages the GIL and lifetime of objectsLets take a look at CSnakes.Runtime.Python.PyObject.cs which defines for
PyObject
Where and where is this function used? From
CSnakes.Runtime.CPython
asthrow PyObject.ThrowPythonExceptionAsClrException();
CSnakes.Runtime.Python
asthrow ThrowPythonExceptionAsClrException();
Correct is that
ThrowPythonExceptionAsClrException
manages the GIL as its defined within theCSnakes.Runtime.Python
namespace.Wrong is the name of the function as C# naming and usage pattern is
throw new Exception();
. So the correct name would bePythonExceptionAsClrException
.But even then - from the
CSnakes.Runtime.CPython
is a bottom-top call.One might argue that
PyObject.ThrowPythonExceptionAsClrException
is an exception and such might break rules. But that doesn't need to be the case.ThrowPythonExceptionAsClrException
does three things:CSnakes.Runtime.Python.PyObject
and creates an Exception objectCSnakes.Runtime.CPython.CPythonAPI.PyErr_Clear()
CSnakes.Runtime.PythonInvocationException
which uses the created Exception object asInnerException
This can be spitted those two parts, where
CSnakes.Runtime.CPython
takes the needed informations as nint/IntPtr and calls PyErr_Clear()CSnakes.Runtime.Python.PyObject
which manages the GIL while wrapping toPyObject
Given that it is
PythonEnvironment
ofCSnakes.Runtime
has top-down access to bothCSnakes.Runtime.CPython
and`CSnakes.Runtime.Python
namespace usage and creates theCSnakes.Runtime.CPython.CPythonAPI
object, the correct implementation would be something like:so
PythonEngine
ofCSnakes.Runtime
can createCSnakes.Runtime.CPython.CPythonAPI
object likeand throwing an error within the
CSnakes.Runtime.CPython
is done viawhere as from
CSnakes.Runtime.Python
its(Note: code above doesn't work with passing the optional
message
parameter yet....)Refactor
CSnakes.Runtime.Python.PyObject.ThrowPythonExceptionAsClrException
using the correct name and namespaces**The
PythonInvocationException
is a subclass ofException
and is located in theCSnakes.Runtime
namespace. As of now its only used fromCSnakes.Runtime.Python
and indirectly byCSnakes.Runtime.CPython
. With the fixes applied its only called fromCSnakes.Runtime.Python
and should moved into that namespace.Refactor
PythonInvocationException
moving it into theCSnakes.Runtime.Python
namespaceThere exists a
CPythonAPI
class within theCSnakes.Runtime.CPython
namespace.CSnakes.Runtime.CPython.CPythonAPI
is somehow redundant.CAPI
is sufficient.Refactor
CPythonAPI
ofCSnakes.Runtime.CPython
toCAPI
Why the urgency?
Simple: each day passing and such possible commits done might add sources to be refactored. At some point that work would be such huge that it
Even if CSnakes states
_Warning
This project is in prototype stage and the API is subject to change._
it looks like from the comments and issus the people are using it in real production projects. And thanks to CSnakes having a NuGet package and Tonys promotion the user base might grow rapid.
For the NuGet package uses which are restricted to
CSnakes.Runtime.Python
those changes should be non-breaking. But there are users of the github sources to have access toCSnakes.Runtime.CPython
(like me). And for those changes are breaking...Additional thoughts
While pythonnet project has a different scope, it sources show some interesting concepts.
Like instead of having the CPythons C-API calls using
nint
orIntPtr
special types are defined:NewReference
BorrowedReference
StolenReference
Those are defined as C# structs containing a
Pointer
of typeIntPtr
to the CPython C-API C PyObject struct.The beauty of this is
My guess is when CPython source code maintainers see this, they are envious as the would like to use some C typedefs but: to much work and will break stuff...
pythonnets C# object based wrapper class which enables automatic lifetime is
PyObject
. Which holds the pointer to the CPython C-API PyObject struct asprotected IntPtr rawPtr
. As pytonnetPyObject
definesinternal BorrowedReference Reference => new (rawPtr);
BorrowedReference
definespublic static implicit operator BorrowedReference(PyObject pyObject) => pyObject.Reference;
the C-API calls defined like
nint PyObject_Size(BorrowedReference pointer)
can be done using the
PyObject
as parameters as the conversion is done implicitly. That is really nice as this makesusing
literally CPythons C-API easier in C# then in C which is pythonnets domain. At the cost of creating a new struct for each C-API call...BTW: has pythonnet a
PyList
object? Of course. Does is define anAppend
method? Of course...Another big difference between pythonnet and CSnakes is where the C-API external functions are stored and how the are called:
CSnakes.Runtime.CPython.CPythonAPI
instance object holds them asstatic
, the calls are done referencing them directly likeCPythonAPI.PyObject_Size(xx);
Runtime
class which includes ainternal static class Delegates
that holds the calls (as static). SoRuntime
does a mapping likeinternal static nint PyObject_Size(BorrowedReference pointer) => Delegates.PyObject_Size(pointer);
The best of both worlds: what to learn from pythonnet for the future of CSnakes
As CSnakes IS going to be THE one-stop solution for Python NuGet packages depending on CSnakes ( people are already asking for specific packages here... https://github.com/tonybaloney/CSnakes/issues/296 ) there will be many NuGet packages. The promise of CSnakes is "just include it", even multiple packages at once.
So the unavoidable will happen
There will be NuGet packages requiring specific versions of Python which can not be used togehter in a single project
As CSnakes exposes only
CSnakes.Runtime.Python
this is like a "broken promise", a "show stopper" which should be avoided at all costs.The solution would be that the DLL calls are kept like in pytonnet in a
Delegate
, but an instance instead of a static class. CSnakes.Runtime.Python classes would not only hold a reference to the C-API Python object but also to that Delegate, e.g. interpreter which created the wrapped object.If CSnakes.Runtime.Python classes want to call the C-API and use another Runtime.Python instance as an argument, the Delegate have to be the same. Otherwise a "fallback" might be used (like using the dotnet frameworks functions to compare by converting the values from Python to it) or a
PythonInvocationException
is thrown.Refactor to make the use of multiple CPython versions in parallel possible
If the existing Python NuGet packages aren't suitable for loading different DLLs at the same time due to different compiler flags or whatever - we should ask the maintainer to fix that. Realiy will look something like: if there is only one fancy "must have" AI package, the downloads of the CSnakes "preferred" Python package will outnumber every other provider by factors...
As a last resort the user of CSnakes could compile all needed Python Version themselves with the same flags, libraries etc. ... For many it will be worth the effort if given a step-by-step documentation or project to do so.
Python being the #1 data scientiest and AI developers language choice there are already many specialized CPython data structures provided as modules. numpy being the best known. But CPython subinterpreters going to get "real" there is a prototypes of Copy-On-Write module which enables some sort of sharing between subinterpreters implemented as a module written in C .
To use some of those special datastructures from C# the "right way" will require using
CSnakes.Runtime.CPython
down to the level where the layout of the C struct must be known to the C# code.No problem for the end user - a NuGet package would provide the CPythons C-module and the correspondening C# class. A huge plus point for using CSnakes ...
Refactor to make
CSnakes.Runtime.CPython
publicAs such new C#
CSnakes.Runtime.Python
based classes will be there wrapping CPython object of specific types, there must be a mechanism to create the C# objects. Also theSourceGenerator
might be more flexible to use those then.Make converting CPython to C# objects configurable
Make the
SourceGenerator
to be extensible to use new C#CSnakes.Runtime.Python
subclassesAs of now developers using
CSnakes.Runtime.CPython
calls would have to manage the lifetime of the CPython objects themselves (Py_IncRef, Py_DecRef). In pythonnet they would use thePyObject
base class with takes that burden away, in CSnakes its in the CSnakes.Runtime.Python` namespace.It is possible to refactor CSnakes in a way that the a
SafeHandle
based class is usable fromCSnakes.Runtime.CPython
and its objects given to the CPython C-API calls... . Developers could use those with "using
" to write less and memory-leak free low level code.A clean design would look something like
CSnakes.Runtime.CPython.CAPI
class --NewReference
,BorrowReference
... having aIntPtr Pointer
fieldCSnakes.Runtime.CPython.API
, subclass ofCSnakes.Runtime.CPython.CAPI
--ReferenceObject
based onSafeHandle
, having aIntPtr
and aReference
like pythonnetPyObject
doesCSnakes.Runtime.CPython.API.ReferenceObject
-- would hold something like aCAPIDelegate
objectThis way functions using the CSnakes.Runtime.CPython.API
or
CSnakes.Runtime.CPython.CAPImust be given the
CAPIDelegateinstance and calling the CPython C-API would look like
capiDelegate.PyObject_GetSize(x);. Its up to the caller of such functions to ensure that all used C-API python objects originate from the same
CAPIDelegate` / CPython instance.Refactor
CSnakes.Runtime.CPython
to have aCAPI
and anAPI
class based interface of which theAPI
enables using theusing
keyword to let the CLR manage objects lifetime.Another "sideeffect" benefit would be that existing pyhtonnet code would be adopted more easily. In best case scenarios like replacing pythonnet (wrongly named)
PyObject
with (correctly named)ReferenceObject
... which would be true for both the developer using CSnakes as developing CSnakes itself.pythonnet license is MIT, so "borrowing" sources is a "non issue" by referering from where its originating...
The
ReferenceObject
class should include the C# convinience functions likeEquals
asCSnakes.Runtime.Python.PyObject
does to make code more readable. But methods likeGetAttr
etc. should be left out as those should be only in theCSnakes.Runtime.Python
namespae. Such must be used asPyObject_GetAttr
...