Closed tzh1043 closed 7 years ago
It's not surprising that leak()
and `leak2() leak: you're telling pybind11 that you are managing the reference, but you aren't.
vec
and copy
look the same to me (more on copy
below).
ptr
and refs
are essentially the same: pybind11 isn't managing the reference itself, but it also is using that same policy for the elements of the vector.
The issue with copy
(and tuple
) is trickier: you're returning an rvalue; pybind sees this and forces return_value_policy::move on the vector, which is fine (you returned an rvalue to pybind11; it has to either move or copy it to store it, because all it has is a temporary).
The problem comes about because container casters (such as stl.h
's list caster used for vector) converts its arguments with the return_value_policy
used on the vector itself—but we just forced that to move
, so it tries moving the pointed-at values inside the vector into new instances which pybind11 will wrap and handle. This, of course, is where we run into trouble: move
was valid on the vector itself, but isn't valid on the vector's elements. (For tuple
, unordered_map
, etc. the story will be the same).
Cc @dean0x7d on this; I'm fairly sure that #510 caused this (but, of course, it also fixed another problem). We're basically (now) forcing a move/copy policy on containers which carries through to the container's elements, which is the problem here.
As a workaround in this case, the following should work:
using ListCasterBase = pybind11::detail::list_caster<std::vector<Element *>, Element *>;
namespace pybind11 { namespace detail {
template<> struct type_caster<std::vector<Element *>> : ListCasterBase {
static handle cast(const std::vector<Element *> &src, return_value_policy, handle parent) {
return ListCasterBase::cast(src, return_value_policy::reference, parent);
}
static handle cast(const std::vector<Element *> *src, return_value_policy pol, handle parent) {
return cast(*src, pol, parent);
}
};
}}
which is basically just wrapping stl.h
's list caster with an override to force the reference
policy. (You'll need to be careful with this though: there's no dependency established between the pointers and the original object, so be sure to keep it around).
Your workaround works, thank you!
It's not surprising that leak() and
leak2()
leak: you're telling pybind11 that you are managing the reference, but you aren't.
Oh, absolutely, I just wanted to demonstrate that the best thing I could come up with (and which does not crash via a double-free) would leak.
There's no dependency established between the pointers and the original object, so be sure to keep it around.
Reading about the weak_ptr-using keep_alive
feature, it seems the raw pointers of Element
can not be tied to anything. Is the vector completely replaced by the python list object or could it somehow be tied to the lifetime of Data
? This would not make much sense for a list (what after list.clear()
) but for an immutable tuple it would at least be something.
Cc @dean0x7d on this; I'm fairly sure that #510 caused this (but, of course, it also fixed another problem). We're basically (now) forcing a move/copy policy on containers which carries through to the container's elements, which is the problem here.
Carrying over the policy to the container's elements is usually sound. If std::vector<Element>
is the return value, reference_internal
is equally bad for both the container and the elements, which are all rvalues: the references would point to temp objects resulting in undefined behavior. So moving (copying) out all the elements is the only thing that can reasonably be done with them.
Unfortunately, this breaks down with std::vector<Element*>
because the policies have a slightly different meaning for raw pointers than for values ("move your pointee" instead of "move yourself"). And this really is special behavior present only for raw pointers. Note that placing smart pointers in the vector would restore expected behavior (smart pointers know what "move" means).
A solution would be to relocate the policy selection logic. Right now, the move
policy override happens when a function is wrapped (and propagates from the container to its elements). This can also be moved so that each individual type_caster
makes the selection for itself. I think this is the most appropriate solution since it would both avoid undefined behavior for rvalues, and ownership issues for raw pointers. However, that does require that each type_caster
has both a cast(const T &)
and cast(T &&)
overloads where each does the right thing. This could result in some duplication and boilerplate between casters, which is why I wanted to avoid it in the first place. But perhaps it wouldn't be too bad. I can code it up, make a proposal and we can take it from there. Thoughts?
The type_caster
suggested by @jagerman looks like the right answer here. Trying to pursue an automatic library solution looks like it would be very messy and only benefit lists of owned raw pointers (which I really hope is just a corner case in a C++11/14/17 world). So I'll close this and refer to https://github.com/pybind/pybind11/issues/637#issuecomment-277422232 for the type_caster
solution for raw-pointer-inside-container ownership.
I ran into this problem recently. It would be nice if the error message that occurs when this occurs gave better guidance about how to fix it and/or adding extra type checks in the template definition for def to detect common cases when this situation occurs statically.
Also, another goofy workaround is to return std::vector<std::unique_ptr<Element*, py::nodelete>>
instead of std::vector<Element*>
in the function you're wrapping.
Sorry to resurrect - is this still the recommended approach for returning std:: containers of raw pointers that have internal_reference semantics?
@TomasPuverle, as far as I know, there's not been a big breakthrough on these complex issues, no.
Thank you for the follow-up.
Also, another goofy workaround is to return
std::vector<std::unique_ptr<Element*, py::nodelete>>
instead ofstd::vector<Element*>
in the function you're wrapping.
This is not working, python is still calling destructor of the Elements instead of nodelete::operator() ...
I have a
Data
class which holds pointers toElement
owned by onlyData
. I want to return these elemens in a container (list or tuple) [1].When returning the internal
vector<Element*>
container ofData
by reference everything works, but returning a container copy or a different container (tuple) does not work.Data::refs() -> vector<>&
orData:ptr()-> vector<>*
, so herepy::return_value_policy::reference_internal
means "touch neither the container nor its contents".Data::copy() -> vector<>
orData::tuple() -> std::tuple<>
, and I somehow want to express that Python then owns the container, but its contents are still owned byData
.Returning (and leaking according to valgrind [2]) a pointer or reference to a heap-allocated vector works. How can prevent this leak? Also, the
RuntimeError: return_value_policy = move
is misleading becausereference_internal
is used everywhere.Every python shell is started with
python3 -i -c 'from datatest import *; d = get_data()'
vs
but interestingly, first calling d.ptr() and then d.copy()
worksdoes not raise when the interactive shell calls__repr__
but is still broken when doing anything else including calling that method yourself.1:
2:
Using Python 3.5.3 and g++ 6.3.