pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/
Other
15.11k stars 2.05k forks source link

[FEAT] being able to enforce the dereferencing of the smart pointer in argument conversion #2751

Open t-vi opened 3 years ago

t-vi commented 3 years ago

Thank you for making PyBind11!

I have a relatively complex ("externally given") class with a funny ownership model, so things can go missing while Python has a reference. To wrap it in Python safely, I'm trying to hold instances in a smart pointer (a la shared_ptr) that checks on dereferencing whether the referred to object still exists. It does so by another pointer indirection.

The trouble is that while PyBind will create the shared ptr, I cannot seem to enforce that argument conversion always goes through the smart ptr and not the raw ptr held alongside.

I tried to do a custom caster, but that ran into "specialization after instantiation" error messages.

There is an "obvious path" to go through lambdas that all take the smart pointer and dereference themselves, but this is obviously a bit cumbersome when it needs to be done to a large number of methods.

@YannickJadoul looked at this on gitter to help clarify my thoughts (but all errors are mine). Thank you Yannick!

#include <pybind11/pybind11.h>
#include <iostream>

template <typename T>
struct Wrap {
  Wrap(T* p) : elem(p) {}
  void clear() { elem = nullptr; }
  T* elem;
};

class MyClass {
public:
  MyClass() : wrap_(std::make_shared<Wrap<MyClass>>(this)) {
  }
  std::shared_ptr<Wrap<MyClass>> wrap() {
    return wrap_;
  }
  ~MyClass() {
    wrap_->clear();
  }
private:
  std::shared_ptr<Wrap<MyClass>> wrap_;
};

template <typename T>
class unwrapping_shared_ptr {
 private:
  std::shared_ptr<Wrap<T>> impl;

 public:
  unwrapping_shared_ptr() = default;
  unwrapping_shared_ptr(std::shared_ptr<T> p) : impl(p) { std::cerr << "construct shared_ptr\n"; }
  unwrapping_shared_ptr(T* p) : impl(p->wrap()) { std::cerr << "construct T*\n"; }
  T* get() const {
    if (! impl->elem) {
      throw std::logic_error("has been invalidated");
    }
    std::cerr << "get deref!\n";
    return impl->elem;
  }
  T** operator&() {
    if (! impl->elem) {
      throw std::logic_error("has been invalidated");
    }
    std::cerr << "& deref!\n";
    return &(impl->elem);
  }
};

PYBIND11_DECLARE_HOLDER_TYPE(T, unwrapping_shared_ptr<T>, true);

namespace py = pybind11;

PYBIND11_MODULE(test_ext, m) {
  py::class_<MyClass, unwrapping_shared_ptr<MyClass>>(m, "MyClass")
    .def("__str__", [] (MyClass& m) { return "MyClass - jolly good"; });

  m.def("get_something", [] () { return new MyClass(); });
  m.def("delete_something", [] (MyClass* m) { delete m; });
  m.def("test_something", [] (unwrapping_shared_ptr<MyClass> m) {
    if (m.get()) {
      return "alive";
    } else {
      return "just resting";
    }});
}

and then in Python this gives me:


In [1]: import test_ext

In [2]: t = test_ext.get_something()
construct T*

In [4]: test_ext.test_something(t)
get deref!
Out[4]: 'alive'

In [5]: test_ext.delete_something(t)

In [6]: test_ext.test_something(t)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-34740f9db26d> in <module>
----> 1 test_ext.test_something(t)

RuntimeError: has been invalidated

In [7]: str(t)
Out[7]: 'MyClass - jolly good'

In [8]: 

Ideally, in [7] it would try to dereference the smart pointer, but it does not.

YannickJadoul commented 3 years ago

To be clear, I'm not making any promises (it's very tricky to access the holder if you don't know what the holder is, since pybind11 type-erases the holder object in ), but it's an interesting angle to take into account when working on the holders: should we always try to access contents of an object through holders? Would that solve some other things? So I'm giving it that label (ping @EricCousineau-TRI, @rwgk, @rhaschke).

Meanwhile, I'll look at the custom type_caster solution.

rwgk commented 3 years ago

should we always try to access contents of an object through holders? Would that solve some other things?

IIUC @rhaschke's comment from a couple weeks ago correctly, the raw pointer is also used in some registry, I hope Robert can chime in (I'd have to dig up that comment).

But assuming that can be resolved (if necessary), I think it's a great small-scale experiment to replace the raw pointer lookup with dereferencing the holder, and see if that has any measurable impact on runtime performance. (I feel like it will not, but you never know before you try.)

YannickJadoul commented 3 years ago

AFAIC, I'm not even sure this should be "fixed", because I'm not sure pybind11 promises anything like this (so it would need a discussion about expected behavior, etc). But it's interesting to keep in the back of our minds, when looking at the other holder types, right? Especially if we notice this results in further confusion, it's worth considering making a semi-breaking change.

small-scale experiment to replace the raw pointer lookup with dereferencing the holder

And yes, I was thinking that a pointer to the value most likely takes as much memory as a pointer to a function that extracts the pointer to the value from the holder (again, things are type-erased, so not that obvious at runtime, because you don't know the type of the holder, so you need some kind of function stored to do so, I think!).

rhaschke commented 3 years ago

I ran into this issue before as well. Having a custom holder type, you would actually expect that the holder's get() function is called each time its value is accessed. But that's not the case, because, in the value_and_holder struct, both the original pointer as well as the holder are stored and - as a shortcut - only the raw pointer is used in most cases. However, that's not the only issue here: In @t-vi original post, a Python-managed object instance is passed to C++ and deleted there. This should not only delete the C++ part of the instance, but also the Python-part, i.e. the wrapper instance as well as its corresponding entry in pybind's internal registered_instances. Furthermore, in the example code, ownership is not clear at all: Passing a raw pointer to delete_something() doesn't grant ownership and thus the right to delete the pointer!

t-vi commented 3 years ago

However, that's not the only issue here: In @t-vi original post, a Python-managed object instance is passed to C++ and deleted there. This should not only delete the C++ part of the instance, but also the Python-part, i.e. the wrapper instance as well as its corresponding entry in pybind's internal registered_instances.

Well, so I cannot destruct python instances, and my goal here is to invalidate them through the smart pointer. This would work if the holder's get() function were called each time its value is accessed.

Furthermore, in the example code, ownership is not clear at all: Passing a raw pointer to delete_something() doesn't grant ownership and thus the right to delete the pointer!

This is "just" the minimal example. So what I would like to do is call clear() on the wrapper (or I could make a clear method on the smart pointer) and then subsequent Python conversions should fail. (I don't expect a solution for the lifetime issue of clearing while someone else still has the pointer.)

YannickJadoul commented 3 years ago

@t-vi, soooo, it's harder than I thought it would be, because of an implementation detail: the default holder_type<T>'s type_caster<holder_type<T>> inherits type_caster<T>. And I wanted to try create type_caster<T> by subclassing (or at least using) type_caster<holder_type<T>>. That's a pretty circular dependency.

Good news is, it's possible, though, by cutting and pasting relevant parts of copyable_holder_caster<T>. Bad news is, it looks pretty horrible; please don't try this at home. It's also (probably?) slightly hacking into implementation details, but then the whole type_caster thing is in detail and (in principle) bound to change without warning, I think. Good (?) news on that level is: no one's going to want to touch that code, anyway, so it's pretty stable...

Now, without further ado, I reveal to you, the monster hack:

namespace pybind11 {
namespace detail {

template <>
struct type_caster<MyClass> : public type_caster_base<MyClass> {
public:
    using type = MyClass;
    using holder_type = unwrapping_shared_ptr<MyClass>;

    bool load(handle src, bool convert) {
        return load_impl<type_caster<MyClass>>(src, convert);
    }

    explicit operator type *() { return static_cast<type *>(value); }
    explicit operator type &() { return *static_cast<type *>(value); }

protected:
    friend class type_caster_generic;

    bool load_value(value_and_holder &&v_h) {
        if (v_h.holder_constructed()) {
            value = v_h.template holder<holder_type>().get();
            return true;
        } else {
            throw cast_error("Unable to cast from non-held to held instance (T& to Holder<T>) "
#if defined(NDEBUG)
                             "(compile in debug mode for type information)");
#else
                             "of type '" + type_id<holder_type>() + "''");
#endif
        }
    }
};

}
}

which gives you, I believe, the output you expected?

import example

t = example.get_something()
example.test_something(t)
example.delete_something(t)
try:
    example.test_something(t)
except Exception as e:
    print(e)
print(str(t))

gives output

get deref!
get deref!
has been invalidated
Traceback (most recent call last):
  File "/home/yannick/tmp/sandbox/test.py", line 10, in <module>
    print(str(t))
RuntimeError: has been invalidated

I want to say "you're welcome", but ... I don't know if this is such a nice present, actually :sweat_smile: So I'll just keep it at: "enjoy" ;-)

t-vi commented 3 years ago

Thank you! :gift: :christmas_tree: This is exactly what I had in mind and it will be useful to me.

t-vi commented 3 years ago

It turns out that with @YannickJadoul 's neat hack, PyBind still appears to cache the "Raw Pointer -> Holder" resolution, so if I delete a C++ object and allocate a new one, the holder isn't obtained through the .get but through the cache, i.e. we get the wrong (old) one. So I'm looking into finding either a way to remove an entry from the cache or to specify a custom casting method that avoids caching...

YannickJadoul commented 3 years ago

Copying from the Gitter discussion:

Yannick Jadoul
@YannickJadoul
21:56
Oh! I think I got it!
@t-vi, yes, this is what happens:
Thomas Viehmann

@t-vi
21:58
I'm on the version used by PyTorch (2.6.1 + a handful of cherry picked).

Yannick Jadoul
@YannickJadoul
22:00

    1. Create something with this nullable holder and return it to Python.
    2. The mapping from this C++ object's address to the Python object gets stored when first creating the Python object.
    3. Destroy the C++ object; the Python object stays in scope.
    4. Return a new C++ object with the same address as before. The mapping is still there (no one removed that cache, because ... who would? The Python object is still alive.), and you return the Python object wrapping the old (deleted) object.

Damn
Cfr. disabling that cache: that's not a pure optimization
It also makes sure that if you return the same C++ object multiple times, that you will get the same Python object

Thomas Viehmann
@t-vi
22:04
Yes, that matches my observations. so I would have a unique address that could be used (that of the wrapping object (not the holder)), but I cannot seem to get the casting to do that.
t-vi commented 3 years ago

Having wrap::clear remove the registered instances appears to work around this issue. So wrap::clear would call clear_registered_instances(elem) defined as follows:

void clear_registered_instances(void* ptr)
{
  auto &registered_instances = pybind11::detail::get_internals().registered_instances;
  auto range = registered_instances.equal_range(ptr);
  for (auto it = range.first; it != range.second; ++it) {
    auto vh = it->second->get_value_and_holder();
    vh.set_instance_registered(false);
  }
  registered_instances.erase(ptr);
}

I'll work more with this added tomorrow and am hopeful that it is indeed more stable.

Fingers crossed that the memory leaks aren't too large.

Thank you @YannickJadoul for the discussion online, errors are my own doing.

YannickJadoul commented 3 years ago

@t-vi That looks about what I had in mind to try, around the time you suggested it, indeed :-)

Fingers crossed that the memory leaks aren't too large.

I'm not really expecting memory leaks. Are you? It is kind of ugly that this is in wrap, rather than the caster, though, but I don't think you can get around that? After all, the caster doesn't know if you're going to steal from it.

(@rwgk, this is going to be quite relevant for you, as well, I believe)

rwgk commented 3 years ago

(@rwgk, this is going to be quite relevant for you, as well, I believe)

FWIW: I looked at @YannickJadoul's trick a bit a couple weeks ago but gave up when I began too feel strongly it doesn't help me in my pursuit of a general solution for improved smart-pointer handling.

My work on that general solution is still very much incomplete (#2672), but the core of how it will work is already here:

https://github.com/pybind/pybind11/pull/2672/files#diff-27833a8008c9cd24dc1e2dcb96af0bd0fb1cbf68eaefde67593bea0661c5eb6bR166

Ignoring the guard rails it boils down to simply:

    template <typename T>
    T *as_raw_ptr_unowned() const {
        return static_cast<T *>(vptr.get());
    }

The vptr (v is for void) dereferenced there is created like this:

hld.vptr = std::static_pointer_cast<void>(shd_ptr);

That means there is a detour through void and back to T, but I believe that's mostly an implementation detail from a user's viewpoint. I'm thinking it will work for @t-vi 's situation but we'll need to try it out to be sure.

@t-vi I wonder if this could be interesting to you: https://github.com/pybind/pybind11/pull/2672/files#diff-311f60a15aa4e63e4bd116b584ceaa5b2d1afd392d69c42dfb539eff459e4071

This demo works with pybind11 as-is. If you go back to that "bare" interface you can do pretty much anything you want, although you're also responsible for all the internals yourself, incl. how you're holding the C++ object in a Python object. If that's too cumbersome for your situation, I'd be happy to help you try the smart_holder code, when it's a little more mature.

t-vi commented 3 years ago

how you're holding the C++ object in a Python object.

I guess looking for pybind11 to help me with this is exactly the bit that got me into trouble... If I had the opportunity to store some shared_ptr<T> or shared_ptr<void> somewhere, it'd be all I need.

t-vi commented 3 years ago

The use-case I tried to distill with the example above is this PR: https://github.com/pytorch/pytorch/pull/50326 .

rwgk commented 3 years ago

If I had the opportunity to store some shared_ptr or shared_ptr somewhere, it'd be all I need.

The smart_holder I'm in the middle of developing does that, but pybind11 as-is also does that, too. An important difference is that the smart_holder also supports disowning, via unique_ptr, which pybind11 as-is doesn't. (pybind11 doesn't even really use the holder as a holder, just something tagged on to keep a reference to the shared_ptr control block more-or-less.)

Could you please help me understand: in your use case, how is the shared_ptr disowned? (It's not possible without a custom deleter AFAIK.) Maybe your definition of "invalidation ... when C++ object is deleted" is different from the disowning as implemented in the smart_holder? I'm trying to understand the core requirement you have, do see if I'm covering that, too, or could cover it.

t-vi commented 3 years ago

I guess it is similar, but not quite the same:

rwgk commented 3 years ago

Is Wrap a type that appears in user code? If the answer is yes, I'd say the shared_ptr is never disowned, only Wrap is cleared. A mainstream way to handle your Wrap<MyClass> would be to simply wrap it:

class_<MyClass>...;
class_<Wrap<MyClass>, std::shared_ptr<Wrap<MyClass>>>...;

Give it an .unwrap() method in Python. You'd get the "has been invalidated" exception exactly when unwrap is called, which is nicely obvious. Explicit is better than implicit?

EricCousineau-TRI commented 3 years ago

I think I've paged in :sweat_smile:

My understanding is that you're using a lifetime proxy object (Wrap<MyClass>) to signal when it's dead. Sweet, that's pretty cool!!! FWIW This reminds me of how VTK's Python bindings admit natural instance tracking, thus interop with other Python bindings (e.g. pybind11) is pretty easy; see this rabbit trail of links for a StackOverflow post

Have y'all already figured out the issues if you're wrapping C++ API that actually wants MyClass directly? I'd assume you could do something like traits to intercept this, e.g. should_proxy_py_through_wrap<MyClass> = std::true_type or whatevs, and specialize all casters based on that?

That being said, if you're going this route, my guess is that it may be easier to ensure that they instance-tracked type is the only thing that pybind11 ever sees? (this is a bit of a counter to your point, Ralf, in that class_<Wrap<MyClass>> may make it hard when pybind11 runs across API surface that touches MyClass directly?)

rwgk commented 3 years ago

(this is a bit of a counter to your point, Ralf, in that class_<Wrap> may make it hard when pybind11 runs across API surface that touches MyClass directly?)

I don't know, I'd have to see the original (real) use case. But I'm thinking: in C++ the unwrap step must be explicit, too, no? If it's explicit in C++, why not also make it explicit in Python as well?

t-vi commented 3 years ago

The original usecase is in the link above.

What happens is that there is a Graph class and graph objects own Nodes, Blocks, Values. The latter three are typically passed around as raw pointers in the C++ API and the user is expected to know

  1. that calling of graph manipulation routines will change which objects exist,
  2. they need to own the graph and keep it alive to access Nodes, Blocks, Values.

While this seems to work reasonably well for C++, it presents the Python interface with two problems:

The wrapper object is only introduced for the benefit of the Python API. This causes the complication of mapping an API working with raw pointers in C++ to one that works with the wrapper in Python. While it could be made safer on C++ with some effort, some of the codepaths might be performance-sensitive not make that terribly attractive.

But I'm thinking: in C++ the unwrap step must be explicit, too, no?

No, it doesn't exist.

If it's explicit in C++, why not also make it explicit in Python as well?

That'd make for a bad API. The obvious alternative to do this by PyBind11 magic is to implement all conversions manually in the Python bindings, but my understanding was that PyBind11 wants to make this easier for me. (Obviously, one might also say the use-case is too obscure for PyBind11 to support it.)

rwgk commented 3 years ago

The latter three are typically passed around as raw pointers in the C++ API ... some of the codepaths might be performance-sensitive

Classical situation, trading safety for performance. That never meshes well with Python, no matter what wrapping system.

The wrapper object is only introduced for the benefit of the Python API.

I see, that's very different then from what I was thinking before.

I recommend against wrapping the C++ API directly. I'd look into implementing a higher-level layer on top, using smart pointers, then wrap that, but it all depends on more details than I can look at.