Closed: wjakob closed this issue 8 months ago
Thanks @wjakob ! Indeed this is worth clarifying.
As far as I know, the GIL is not necessary if the deleter is implemented against the C API. There can be cases where the deleter itself is defined through Python's FFI mechanism; in that case, the GIL acquisition could happen inside the deleter, but it is nevertheless not the job of the caller of the deleter.
The C docs clearly should not know anything about the GIL. As for the Python docs, this section talks about transfer of ownership of the `DLManagedTensor` to the consumer: https://dmlc.github.io/dlpack/latest/python_spec.html#implementation. The `dlpack_capsule_deleter` directly uses the Python C API, so it clearly already holds the GIL. The deleter can only be called once within `dlpack_capsule_deleter` (there's an early exit at the top). So it seems pretty clear at the moment.
@wjakob what exact comment/phrasing would help you here?
How about the following?
> When the capsule has been consumed, the consumer library is responsible for calling `DLManagedTensor::deleter`. From the viewpoint of the producer, no guarantees are given with respect to the execution context of this subsequent function call. For example, the call could take place on a different thread, and the Python GIL may or may not be held. It is the responsibility of the `deleter` implementation to handle such conditions using synchronization primitives, if needed.
Thanks, looks reasonable - except "`deleter` implementation" here seems wrong (it's the producer that implements the `deleter`; you want to say that it's the responsibility of the caller of the `deleter`).
Really? Maybe I am misunderstanding something. My mental picture is:
Suppose Framework X needs to run a few Python operations to clean up a tensor that is now unreferenced. It needs to acquire the GIL to invoke those Python API calls safely. However, the part of Framework Y that released the consumed tensor has no idea about Python or the GIL.
So it's the responsibility of the implementation of the deleter (provided by framework X) to first acquire the GIL before subsequent steps.
@wjakob I think your interpretation is correct. However, most frameworks implement the allocation of the object themselves in C/C++/Cython and then expose it to Python, in which case the deleter does not need to acquire the GIL.
It is easier for the `deleter` caller to hold the GIL, because when the caller retrieves the pointer from the capsule using `PyCapsule_GetPointer` it already needs to see a valid Python context. (We didn't require copying a `DLManagedTensor` when the consumer first receives the capsule, so it is safer that we only consider accessing the `deleter` from a capsule.)
@leofang I think part of the confusion is that there are effectively two "deleters", and perhaps it would be good to come up with some kind of terminology in the documentation to avoid ambiguity.
The first is the capsule deleter. It is invoked when the capsule expires, and it may or may not call `DLManagedTensor::deleter` depending on whether the capsule is marked consumed. The GIL is always held in this case.
The second is the DLPack tensor deleter (the `DLManagedTensor::deleter` field). It may be called by an expired capsule (in which case the GIL is held), but alternatively it could also be called by an external framework that is now using the tensor. This call could originate from another thread, and in a context where it may not be straightforward to acquire the GIL (e.g. when called by the C++ portion of another framework that doesn't even link to `libpython`).
So my suggestion would be to add a sentence to the documentation saying that `DLManagedTensor::deleter` may be called from an arbitrary context (where "context" refers to the caller thread and the status of locks like the GIL). This means that additional checks and synchronization may be needed to safely issue Python C API calls.
Hypothetical example: what if `manager_ctx` is actually a pointer to a `PyObject *` that backs the tensor storage? In that case, we need to acquire the GIL before the reference counting operation `Py_DECREF` can be safely executed. There is no harm in doing this if the GIL is already held, but it avoids undefined behavior when `my_deleter` is called on another thread by a C++-based framework.
```c
void my_deleter(struct DLManagedTensor *self) {
    /* Acquire the GIL; harmless if this thread already holds it. */
    PyGILState_STATE state = PyGILState_Ensure();
    Py_DECREF((PyObject *) self->manager_ctx);
    PyGILState_Release(state);
}
```
> I think part of the confusion is that there are effectively two "deleters", [...]
I'm not sure it's that confusing; at least to me "the deleter" is always `DLManagedTensor::deleter`. It is the only thing that is being actively called.
> So my suggestion would be to add a sentence to the documentation saying that `DLManagedTensor::deleter` may be called from an arbitrary context (where "context" refers to caller thread, and status of locks like the GIL). This means that additional checks and synchronization may be needed to safely issue Python C API calls.
This sentence makes sense to me. It probably needs some context that the remark is intended for implementers of a Python library, who may not be aware that there are non-Python DLPack users. Otherwise it's more confusing than helpful for C/C++ users, who will wonder why on earth they need to bother with a lock from a language they're not using.
> Hypothetical example: what if `manager_ctx` is actually a pointer to a `PyObject *` that backs the tensor storage.
There is nothing hypothetical about this. This is the current way NumPy, cupy, and probably half of the other Python providers work. So this is a very important clarification that many implementations will violate as soon as it is formalized. (They probably mostly just use decref, so it is very unlikely for problems to occur in practice, but it is a clear violation.)
`Py_DECREF` can trigger arbitrary Python code to run, and it also uses a non-atomic decrement instruction that may race with other reference count changes done in an independently running Python thread. You really do need to hold the GIL when issuing Python API calls.
My suggestion would be that the `my_deleter` example posted above is included in the documentation to clarify this potentially dangerous situation.
Yes, it seems that it should probably be written that "the deleter MUST NOT assume that the GIL is held when it is called" (for Python but applies to any other environment).
The caveat that should maybe be noted: this is not necessarily true for current implementations. So users should ideally only rely on that for the "next" version of `__dlpack__`.
> Yes, it seems that it should probably be written that "the deleter MUST NOT assume that the GIL is held when it is called" (for Python but applies to any other environment).
Noisy +1 to this. This would make things clearer for exporters of non-Python data (such as PyArrow).
@tqchen @seberg Do you think we should update the docs before tagging DLPack 1.0?
feel free to send a PR
gh-140
Dear DLPack authors,
DLPack is widely used to enable interoperability between large C++-based frameworks that furthermore provide Python bindings. The C++ parts of such a framework are often usable through an independent C++ API (e.g. in the case of PyTorch, Tensorflow, ..), where functions can be called from multi-threaded code. In contrast, multithreading in Python is much more restricted: any use of the CPython API requires that the Global Interpreter Lock (GIL) is held.
This discrepancy has implications for projects mixing C++ and Python code. Functions in such a mixed C++/Python codebase should clarify whether the caller must ensure that the GIL is held.
This is currently unclear in the case of `DLManagedTensor::deleter`, which could be called by the `PyCapsule` destructor (where the GIL is held) or from arbitrary C++ code at some later point following "consumption" of the capsule (where the GIL is not necessarily held -- this destructor call could even occur from a different thread!).

I don't have any strong opinions either way, but ideally the documentation of the interfaces should say so.
Thanks, Wenzel