Open EricCousineau-TRI opened 1 year ago
Option (2) sounds best to me. I don't think we'll need enable_shared_from_this
. Remember that the "managed object" and the "pointed-to value" of a shared_ptr
need not be the same. Something like a shared_ptr<Context>
could still be a subcontext alias into a larger context, where the managed object is the full context and the stored pointer is the subcontext.
Summary
This is the core issue to #20131, with more minimal Drake example in #20380 See https://github.com/RobotLocomotion/pybind11/pull/65 for pybind-only minimal reproduction case.
If we have a pure C++ object (it is not a Python class inheriting from a C++ base), then the Python "view" of the object can get garbage collected if we pass ownership solely to C++ and the Python view goes out of scope.
This is generally not an issue, as the C++ object itself stays alive. However, this becomes a problem when the Python view of the object has additional information, such as
keep_alive<>
annotations - most importantly, it has patients that the Python view keeps alive.Example
Adapted from #20131: https://github.com/RobotLocomotion/drake/blob/20a030af9b6ae4800d95081558cd447cec4e715f/tmp/repro.py#L163-L174
The sequence of events:
context
created.plant_context
created, keepscontext
alive as its patient.collision_constraint
is created, keepsplant_context
alive, which keepscontext
alive.collision_constraint
is added totrajopt
.collision_constraint
hasref_count == 0
, thus is gc'd, along with itskeep_alive
relationships.context
alive, so it gets gc'd.plant_context
has been freed memoryRelated upstream issues
I have not searched for these yet.
Possible Solutions
1. Whenever we pass a
shared_ptr<>
from Python to C++, we still annotate it.This is not bullet proof (we can still lose
keep_alive
relationships), but would at least improve things.2. Ditch our pybind11 fork and use
shared_ptr<>
more (#5842)Even if we wholesale switched to
shared_ptr<>
and ditched ourpybind11
fork (#5842), I believe this would still be an issue unless we guarantee every C++ object shown to Python is only ever exposed asshared_ptr<>
/ (and possibly usingenable_shared_from_this
if we need to expose raw pointers).If there is ever a raw pointer that has lifetime constraints, then that necessitates
keep_alive<>
, which may fall apart in situations like this.3. When passing ownership from Python to pure C++, keep the object alive if it has any patients
This may keep things alive longer than intended, and may be slow to do the nurse-based lookup.
I am also not sure how hard this is for a pure Python object. To prevent slicing, etc., we inject our own logic into
tp_dealloc
for the derived class.I need to check if the pybind11
smart_holder
branch has a mechanism for this, but I doubt it (this is kinda niche / silly).4. Never delete anything.
This seems bad?
@jwnimmer-tri @SeanCurtis-TRI @sammy-tri @RussTedrake - do y'all have suggestions / alternatives?
Version
20a030af9b6ae4800d95081558cd447cec4e715f