Open steve-s opened 1 year ago
Related to that are global (from the C point of view) references to Python objects. They are also unknown to the runtime, so tracing GC would not know to mark them as GC roots.
You're making the assumption that either tp_traverse is not implemented or is incomplete. I would call it a bug in C extensions.
CPython may provide a debug mode to detect bugs in C extensions. For example, it would be great to be able to detect a tp_traverse implementation which doesn't traverse the type for instances of heap types. So far, I have failed to design such a checker (at least an efficient one).
> You're making the assumption that either tp_traverse is not implemented or is incomplete. I would call it a bug in C extensions.
Even if it is correct, it would be great if tp_traverse were specified to do the one thing it should do and nothing else, so that alternative Python implementations would not even need to call it if they can do that thing themselves. Examples are implementations based on .NET, Java, or another garbage-collected language that want to reuse the host language's GC. Most such languages do not provide extension points for the tracing part of the GC, and one approach is to build a "shadow" graph of managed objects that are visible to the GC. For that, the runtime must be notified about object graph changes on the native side.
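To make the "shadow" graph idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the ShadowGraph class and the store_field and add_root names are invented for illustration and do not correspond to any existing CPython or HPy API. The only assumption is the one stated above, namely that every native-side reference write is reported to the runtime.

```python
from collections import defaultdict


class ShadowGraph:
    """Hypothetical runtime-side mirror of references held by native objects."""

    def __init__(self):
        # owner -> {field name -> target}
        self.fields = defaultdict(dict)
        self.roots = set()

    def add_root(self, obj):
        # For example, a C global that holds a reference to a Python object.
        self.roots.add(obj)

    def store_field(self, owner, field, target):
        # Every native-side reference write is assumed to go through this
        # call, so the runtime always sees the current set of edges.
        if target is None:
            self.fields[owner].pop(field, None)
        else:
            self.fields[owner][field] = target

    def reachable(self):
        # Plain mark phase over the mirrored edges; no user callback is run.
        seen, stack = set(), list(self.roots)
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.fields.get(node, {}).values())
        return seen


graph = ShadowGraph()
graph.add_root("module_global")
graph.store_field("module_global", "cache", "A")
graph.store_field("A", "next", "B")
graph.store_field("B", "next", "A")  # cycle between A and B
graph.store_field("C", "next", "A")  # C is not reachable from any root
live = graph.reachable()
```

With this bookkeeping, a host GC (or a tracing collector) could find the live set without ever calling into extension code, which is the point of the comment above.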
> CPython may provide a debug mode to detect bugs in C extensions.
If the runtime gets notified about object graph changes on the native side, one could check the tp_traverse contract fully, i.e., that it traverses exactly what it should. The HPyField design allows that, and we plan to implement that check in the HPy debug mode.
Related: https://github.com/capi-workgroup/problems/issues/36
Are you suggesting changing the C API design to not expose reference counting? To support a GC different from the one used by CPython?
In general, I believe that if the runtime knows about the object graph, i.e., changes in references between objects must be channeled through some API and cannot be made behind the runtime's back, it can be useful for numerous things.
The obvious one is a classical generational and/or concurrent GC, which requires read/write barriers. However, I think another use case could be a debug mode that checks that tp_traverse does what it should (visit all the referenced objects, nothing more and nothing less). There are probably more; off the top of my head, for example, heap dumps that would let you visualize and browse the object graph.
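The debug-mode check could look roughly like the sketch below. It is a Python model only: check_traverse, known_refs, and the traverse callback signature are hypothetical names chosen for illustration, and the sketch assumes the runtime has already recorded every reference the object stored.

```python
def check_traverse(known_refs, traverse):
    """Compare what a traverse callback visits with what the runtime recorded.

    known_refs: references the runtime saw being stored (hypothetical bookkeeping).
    traverse:   callable that takes a visit(obj) callback, modelled on tp_traverse.
    Returns (missed, extra): recorded references that were not visited, and
    visited objects that were never recorded.
    """
    visited = []
    traverse(visited.append)
    known_ids = {id(o) for o in known_refs}
    visited_ids = {id(o) for o in visited}
    missed = [o for o in known_refs if id(o) not in visited_ids]
    extra = [o for o in visited if id(o) not in known_ids]
    return missed, extra


a, b, c = object(), object(), object()
known = [a, b]  # the runtime saw stores of a and b


def buggy_traverse(visit):
    visit(a)
    visit(c)  # c was never recorded as a reference of this object


def correct_traverse(visit):
    visit(a)
    visit(b)


missed, extra = check_traverse(known, buggy_traverse)
clean = check_traverse(known, correct_traverse)
```

Here the buggy callback is reported as missing b and visiting the unrecorded c, while the correct one produces two empty lists.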
> Are you suggesting changing the C API design to not expose reference counting?
https://github.com/capi-workgroup/problems/issues/12
Not exposing ref-counting would be the most important step towards being able to swap ref-counting for a GC or anything else different from the current ref-counting, like some hybrid approach or I don't know what. The point is that, ideally, the API should not block the runtime from implementing another memory management strategy.
Continuing with "the API should not block the runtime from implementing another memory management strategy": not knowing about changes in the object graph is another problem, one that would prevent Python (any implementation) from taking full advantage of GC approaches that are already well known, well researched, and widely used in industry.
Think about nogil and the concerns about its compatibility. With an API that does not expose reference counting, this would be a non-issue. With an API that allows the runtime to eavesdrop on changes in the object graph, maybe it could even enable better optimizations of the current reference counting in CPython so that it works better with real multithreading.
Even if CPython currently cannot take advantage of this (because it will have to support existing extensions, at least for some time), my take is that one of the aims of this repo is to collect requirements for some ideal API. I think that this should be part of such an API. I understand that it is probably not realistic with the current CPython API.
> To support a GC different from the one used by CPython.
I have only basic knowledge of CPython's GC. Putting aside no-gil, wouldn't it be possible, in theory, to exploit this in CPython as well? For example, you can partition the heap into segments and keep a "dirty" bit per segment. On a "field" write, you use some fast bit masking to set the dirty bit of the affected segment. When doing cycle detection, you don't need to scan segments that were not touched (no "field" writes) since the last GC, or something along these lines.
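A toy model of that dirty-segment idea, in Python, might look like this. It is only a sketch under stated assumptions: SegmentedHeap, write_field, and segments_to_scan are hypothetical names, objects are modelled as integer slots, and the segment size of 16 slots is arbitrary. It is not how CPython's collector works today, and a real collector would need more care than "skip clean segments".

```python
SEGMENT_SHIFT = 4  # 16 slots per segment; an arbitrary choice for this sketch


class SegmentedHeap:
    """Toy heap where each slot holds at most one reference to another slot."""

    def __init__(self, n_slots):
        self.refs = [None] * n_slots
        segment_size = 1 << SEGMENT_SHIFT
        n_segments = (n_slots + segment_size - 1) >> SEGMENT_SHIFT
        self.dirty = [False] * n_segments

    def write_field(self, slot, target):
        # Write barrier: store the reference, then flag the owning segment.
        # The segment index is computed with a single shift.
        self.refs[slot] = target
        self.dirty[slot >> SEGMENT_SHIFT] = True

    def segments_to_scan(self):
        # Only segments written to since the last collection are returned;
        # all dirty bits are then reset for the next cycle.
        todo = [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
        self.dirty = [False] * len(self.dirty)
        return todo


heap = SegmentedHeap(64)
heap.write_field(3, 40)   # slot 3 lives in segment 0
heap.write_field(5, 7)    # slot 5 also lives in segment 0
heap.write_field(35, 3)   # slot 35 lives in segment 2
first = heap.segments_to_scan()
second = heap.segments_to_scan()
```

The first call reports segments 0 and 2; the second reports nothing because no field was written in between. The barrier only works if every reference write goes through write_field, which is exactly the API property being asked for in this thread.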
Proposed solution: https://github.com/faster-cpython/ideas/issues/553 (Grand Unified Python Object Layout) (Mention in the “revolution” repo: https://github.com/capi-workgroup/api-revolution/issues/8)
Old-ish thread, but I was looking into something related and stumbled here. Figured it might be useful for future discussions
> You're making the assumption that either tp_traverse is not implemented or is incomplete. I would call it a bug in C extensions.
Regarding "incomplete", the docs for tp_traverse say:
> Note that Py_VISIT() is called only on those members that can participate in reference cycles. Although there is also a self->key member, it can only be NULL or a Python string and therefore cannot be part of a reference cycle.
(For more context: although the documentation does suggest calling Py_VISIT() on all owned references, it is still explicitly allowed to skip certain owned references.) So this takes us back to the original post: the object graph is opaque to the runtime, and tp_traverse is not sufficient to, for example, implement a tracing GC.
Regarding "not implemented", I don't think it's actually a documented requirement that C extension types that could form cycles support cyclic GC (through Py_TPFLAGS_HAVE_GC and tp_traverse)? I.e., it's currently valid behaviour for a C extension to leak cycles.
> (For more context, it does suggest calling Py_VISIT() on all owned references, it is still explicitly allowed to skip certain owned references.)
The traverse function design is for the current CPython GC implementation.
The GC mostly needs to know which objects contain other objects, but only the ones that are tracked by the GC. If you have a list of integers, the traverse function doesn't have to visit them, since integers are not tracked by the GC.
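The tracked/untracked distinction can be observed from Python with the standard gc module. The snippet below is a small illustration on CPython; gc.is_tracked() and gc.get_referents() are real functions, and get_referents() is documented as returning the objects visited by the argument's C-level tp_traverse.

```python
import gc

numbers = [1, 2, 3]

# A list is a container, so the collector tracks it.
list_tracked = gc.is_tracked(numbers)

# Integers cannot refer to other objects and are not tracked.
int_tracked = gc.is_tracked(numbers[0])

# Whatever the list's tp_traverse visits is returned here.
# The order of the result is an implementation detail.
referents = gc.get_referents(numbers)
```

On CPython the list's traverse function happens to visit its integer items anyway, so they show up in referents; the point above is that it would not have to, because the collector ignores untracked objects.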
Instances of custom types can stash references to other Python objects anywhere in their memory without telling the interpreter about it (they only have to incref/decref them). This makes the object graph opaque to the Python runtime. Right now, this is solved by user-provided tp_traverse and tp_clear.
The runtime has no control over what tp_traverse/tp_clear can do. It can have bugs and miss some objects, or it can do extra work that it shouldn't, blocking the caller for too long.
tp_traverse/tp_clear is called on a Python thread (one that holds the GIL). This means that the GC cannot run concurrently with the application in a different thread, for example.
The same applies to module state.
Related: https://github.com/capi-workgroup/problems/issues/12