pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/
Other
15.11k stars 2.05k forks source link

[smart_holder] Bug fix: `std::unique_ptr` deleter needs to be copied. #4850

Closed rwgk closed 8 months ago

rwgk commented 9 months ago

Description

IMPORTANT: Quite possibly follow-on work is needed. See comments for details, in particular https://github.com/pybind/pybind11/pull/4850#issuecomment-1788806760 below.

Root cause for this bug: Firmly held wrong belief that std::unique_ptr does not store a deleter.

But here it is:

Bug detected via a few failing tests with PyCLIF-pybind11 (out of many millions).

Suggested changelog entry:

rwgk commented 8 months ago

Hi @iwanders,

I'm replying here to your message under https://github.com/rwgk/pybind11/pull/26.

When I saw your message I decided it's best to first transfer a simple existing internal-only test to a new test_class_sh_unique_ptr_custom_deleter here, so that you see everything that exists already:

https://github.com/pybind/pybind11/pull/4850/commits/c684f6cec00d5a8536f070883b6bb326fefc9bae

I was thinking that we probably need more tests in tests/pure_cpp/smart_holder_poc_test.cpp, and expand test_class_sh_unique_ptr_custom_deleter. But I'm very uncertain myself what & how exactly.

Things that come to mind as maybe relevant:

It's not obvious, but some code in smart_holder_poc.h turned out to not be usable like that from smart_holder_type_casters.h (when I worked on it originally a couple years ago). It's only used by tests/pure_cpp/smart_holder_poc_test.cpp. One of the things I always wanted to do but never got to: move all code that is not used from smart_holder_type_casters.h to the tests/pure_cpp/ directory, so that only the production code is left in smart_holder_poc.h. The test-only code in smart_holder_poc.h is probably more distracting than helpful for your purposes.

I was thinking of the smart_holder_poc.h code under this PR as ~complete: it works for Google-internal global testing and solved in-the-wild test failures (17 tests, probably 3-5 unique use cases). Therefore I'm surprised that you're adding code in smart_holder_poc.h.

What do you think about this approach to collaborating:

iwanders commented 8 months ago

I was thinking of the smart_holder_poc.h code under this PR as ~complete: it works for Google-internal global testing and solved in-the-wild test failures (17 tests, probably 3-5 unique use cases). Therefore I'm surprised that you're adding code in smart_holder_poc.h.

Hmm, I may be missing something, but I was assuming a custom deleter should survive a roundtrip from unique_ptr -> smart_holder -> unique_ptr.

Currently it doesn't:

Github won't let me comment on the appropriate line, but I feel this current PR only fixes the bug in one direction (going into smart holder), not in the direction of going out of the smart holder. This test shows the problem, it will fail with this branch and unmodified code, which is why I was modifying the smart holder file.

Edit;

Could you help by reviewing this PR? — With the idea that we want to ensure this PR by itself is healthy, but we don't want to add much.

Went through it, overall it looks fine to me, my main concern is the above statement that the deleter is only handled when going into the smart holder, not when being converted back to a unique pointer.

What do you think about this approach to collaborating: [...] Then you could expand the test coverage more systematically (I'd have suggestions) in a follow-on PR

Sure, I can't commit a lot of time, but happy to help over the next week or so.

rwgk commented 8 months ago

@iwanders I found a simple trick to make it obvious what is and isn't used in production: 4d9d722215cce2a5aabc8a894208dc5c1fdf2182

I need to do something else right now, but will come back here to similarly ifdef-out all smart_holder_poc.h code that is not actually used in production code.

rwgk commented 8 months ago

Side remark:

Something weird is happening with this test:

..\..\tests\test_type_caster_odr_guard_1.py ..FF                         [ 95%]

It seems to affect only windows-2022 builds. Un/Related? I need to find out.

(There is one test_iostream flake mixed in.)

rwgk commented 8 months ago

Speaking of as_unique_ptr, another random thought; that method could be marked &&, so std::unique_ptr<T, D> as_unique_ptr() &&, that ensures it can only be called on an r-value smart_holder, which makes it clear that the smart_holder object after the method call is in an indeterminate state.

Hm ... IIUC that's not in line with my thinking behind the code. I wanted to leave the smart_holder in an well-defined state at all times. I believe it has to, because we have to assume it lives in a Python object until that is destroyed.

Your very good round-trip question seems much more important in the context of smart_holder_type_casters.h: is that actually working as intended?

I'm thinking we need something like these:

https://github.com/pybind/pybind11/blob/e955753c1b34c64f39907b8459d6e0d383e1ca12/tests/test_class_sh_basic.cpp#L58-L65

Only more sophisticated to establish firmly that the deleter is copied properly.

rwgk commented 8 months ago

I think this PR is in good shape for merging now, but I'm mystified by the persistent type_caster_odr test failures with windows-2022. I need to get to the bottom of those first.

I'll use that time to rerun Google global testing for this PR, although the only change is the std::move / std::forward replacement.

iwanders commented 8 months ago

Hm ... IIUC that's not in line with my thinking behind the code. I wanted to leave the smart_holder in an well-defined state at all times. I believe it has to, because we have to assume it lives in a Python object until that is destroyed.

Ah, sorry, I worded this ambiguous. You're right it will be in a well defined state, I was mostly concerned about the semantics, to me the as_unique_ptr consumes the smart holder and it would make sense to me to require an rvalue to allow that. On second look I see is_populated is never set to false, except through the default constructor, it's not set to false by as_unique_ptr, but again, because it's only used by the unit tests lets leave it ;)

Only more sophisticated to establish firmly that the deleter is copied properly.

Hmm, still trying to wrap my head around that file as an outsider xD. Easiest way I'd test it is by making a destructor that has a side effect (sets a boolean on call), then perform the round trip and ensure the deleter is only called after the final value goes out of scope.

rwgk commented 8 months ago

Easiest way I'd test it is by making a destructor that has a side effect (sets a boolean on call), then perform the round trip and ensure the deleter is only called after the final value goes out of scope.

That sounds good!

I recommend waiting until this PR is merged, so that we can collaborate more easily. I'll try to get to the bottom of the type_caster_odr_guard failures with highest priority.

rwgk commented 8 months ago

Certainly, the windows-2022 failures are due to changes in the environment, unrelated to this PR:

That ran with smart_holder head, without any code changes.

6 windows-2022 failures, 5 are type_caster_odr_guard failures, one test_iostream flake (which is very common).

I'll revert my latest commit here, then merge. I'll work on dealing with the type_caster_odr_guard failures separately.

rwgk commented 8 months ago

For easy future reference, global testing internal ID: OCL:566559299:BASE:578853944:1698937167027:61af92f8 (passed)

rwgk commented 8 months ago

@iwanders FYI I'll merge from master into smart_holder, too. Should be done in an hour or so. Doesn't change much I guess, just so you're not surprised.

rwgk commented 8 months ago

FYI I'll merge from master into smart_holder

Done.

iwanders commented 8 months ago

@rwgk , thanks, I haven't had time to look at it yet, but wanted to let you know I'll try to make that deleter roundtrip unit test over the weekend. I have to take some time to read up on how the smart holder actually works and how it interacts with unique pointers or deleters first :)

rwgk commented 8 months ago

read up on how the smart holder actually works

I'd be happy to meet and go through the code together. If you're interested, please get in touch via email (in git log). (I looked but couldn't find your direct email.)