pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/
Other
15.1k stars 2.05k forks source link

bugfix: KeysView/ValuesView/ItemsView using Python types. Fix #4529 #4983

Closed hczhai closed 5 months ago

hczhai commented 6 months ago

Description

This PR is an alternative to PR #4972. Fixes issue #4529 and #3986. Compatible to #4888.

Here, KeysView/ValuesView/ItemsView will have python style typing strings, like KeysView[int] (see #3986). KeysView[int] will only be registered once for all relevant KeyType = int16_t, int32_t, int64_t, ... which solves #4529. A mix of C++ and Python types can also be supported.

Note: To achieve the goal of this PR I cannot directly use detail::make_caster<KeyType>::name for the Python type name. Consider the case when KeyType = std::pair<int, std::vector<int>> which is a mix of Python and C++ types. detail::make_caster<KeyType>::name for this case will be tuple[int, %]. The % appears, since detail::make_caster<KeyType>::name is a constexpr, which cannot recognize the binding name for std::vector<int>. PR #4353 uses if (key_type_name == "%") to get rid of "%" and fall back to the C++ type name, but unfortunately this does not work for the nested case tuple[int, %]. With this PR, I introduced detail::type_mapper<KeyType>::py_name() to construct the correct type name.

Suggested changelog entry:

Fix type name clash in ``KeysView`` and ``ValuesView`` when using multiple ``bind_map`` with different integer key types
rwgk commented 6 months ago

Wow, thanks!

It's very late in my current timezone, I can only look later.

I tried to start global testing for this PR as well, hoping for overnight results, but I didn't get very far, initial testing fails with this error (one of many):

libc++abi: terminating due to uncaught exception of type std::__u::system_error: ODR VIOLATION DETECTED:

pybind11::detail::type_caster<std::__u::map<int, int, std::__u::less<int>, std::__u::allocator<std::__u::pair<int const, int>>>>:

SourceLocation1="third_party/pybind11/include/pybind11/detail/../detail/type_caster_base.h:1110"
SourceLocation2="third_party/pybind11/include/pybind11/stl.h:281"

NOTE: The line numbers are off, compared to pybind11 master.

This error is generated by the feature merged with smart_holder PR #4022. (Google production uses a branch of pybind11 that includes that feature.)

It's saying that there is one (or more) translation unit(s) in which type_caster_base is used, and one (or more) translation unit(s) in which the stl.h map_caster is used for the same T.

That's all I know at the moment. I/we need to find out how this PR is leading to that situation.

hczhai commented 6 months ago

@rwgk Thanks for the testing and the quick feedback.

It's saying that there is one (or more) translation unit(s) in which type_caster_base is used, and one (or more) translation unit(s) in which the stl.h map_caster is used for the same T.

So this seems to be a case when it is a mix use of "stl.h" and "stl_bind.h"?

This error is generated by the feature merged with smart_holder PR https://github.com/pybind/pybind11/pull/4022. (Google production uses a branch of pybind11 that includes that feature.)

I tried merging this PR with the smart_holder branch, and it passed all tests, so they are compatible according to the tests there. Since your line number (also) does not match the line number in smart_holder branch, maybe you can point me to a more relevant branch?

Considering the error message you provided only has something to do with type_caster, I deleted all (explicit) usage of detail::make_caster in this PR (without affecting the functionality). Please see if the problem can now be fixed.

If possible, please also let me know whether PR #4972 can pass your tests.

rwgk commented 6 months ago

I tried merging this PR with the smart_holder branch, and it passed all tests

The feature may not be ON by default, it depends on how exactly you run the tests. — But it might be best not to spend time reproducing my observations; better to stay focused on pybind11 master.

The problem is probably (I didn't verify yet) because of the rather unusual way the pybind11 unit tests are organized. Most tests are implemented as submodules and statically linked together into pybind11_tests, e.g.:

from pybind11_tests import stl_binders as m

I'm thinking the ODR guard errors will disappear if we change test_stl_binders.cpp to be in an isolated module, so that the linker visibility protections apply (optional: look for "visibility" under https://pybind11.readthedocs.io/en/stable/faq.html).

We're just lucky that the existing tests are such that the ODR guard doesn't trigger.

We have 4 tests already that are in isolated modules (cross_module_gil_utils, cross_module_interleaved_error_already_set, eigen_tensor_avoid_stl_array, pybind11_cross_module_tests), so it's probably pretty easy to change the stl_binders test to be in an isolated module, too. ... I'll give that a try.

rwgk commented 6 months ago

Considering the error message you provided only has something to do with type_caster, I deleted all (explicit) usage of detail::make_caster in this PR (without affecting the functionality). Please see if the problem can now be fixed.

Oh, wait, I overlooked that before.

rwgk commented 6 months ago

Considering the error message you provided only has something to do with type_caster, I deleted all (explicit) usage of detail::make_caster in this PR (without affecting the functionality). Please see if the problem can now be fixed.

Oh, wait, I overlooked that before.

I tried, it doesn't help.

Let me try moving the stl_binders test to an isolated module.

rwgk commented 6 months ago

Previously I wrote:

I'm thinking the ODR guard errors will disappear if we change test_stl_binders.cpp to be in an isolated module

Confirmed, by applying #4984 locally. (The GitHub Actions failures under #4984 are yet another known problem; see PR description there.)

I'm now in a position to run Google global testing with this PR. That will probably take about half a day at least.

rwgk commented 6 months ago

The reduced diff for the case identified previously is below.

Looks like we need special handling for %.

-class ItemsView[int, object]:
+class ItemsView[int, %]:

-class ItemsView[str, object]:
+class ItemsView[str, %]:

 class KeysView[int]:
-    @overload
-    def __contains__(self, arg0: int) -> bool: ...
-    @overload
     def __contains__(self, arg0: object) -> bool: ...

 class KeysView[str]:
-    @overload
-    def __contains__(self, arg0: str) -> bool: ...
-    @overload
     def __contains__(self, arg0: object) -> bool: ...

-    def items(self) -> ItemsView[int,object]: ...
+    def items(self, *args, **kwargs) -> Any: ...

-    def values(self) -> ValuesView[object]: ...
+    def values(self, *args, **kwargs) -> Any: ...

-    def items(self) -> ItemsView[str,object]: ...
+    def items(self, *args, **kwargs) -> Any: ...

-    def values(self) -> ValuesView[object]: ...
+    def values(self, *args, **kwargs) -> Any: ...

-class ValuesView[object]:
+class ValuesView[%]:
rwgk commented 6 months ago

@hczhai @aimir @sizmailov

I'm still waiting for the global testing results for this PR, but maybe it's good to share what's going through my mind (inconclusive):

The type_mapper feature introduced in this PR adds a pretty significant amount of code and complexity. — That's a cost.

An already existing cost: class_ wrappers for each K, V, K, V.

What's the benefit?

Really, all we're getting is [K], [V] or [K, V] annotations.

Is that worth it?

Assume we decide the cost / benefit is too high by some measure.

We could just class_-wrap keys_view, values_view, items_view (non-templated!) once each.

Then something like

template <typename CppType>
struct keys_view_cpp : keys_view { /* ... */ };

to override the member functions, and then return a base-class pointer here:

    cl.def("keys", [](Map& m) -> std::unique_ptr<keys_view> { /* ... */ });

In terms of code complexity this would be much cheaper.

Would it be a big loss to not have the [K], [V] or [K, V] annotations?

hczhai commented 6 months ago

@rwgk Thanks for these tests and comments. They are very helpful.

Looks like we need special handling for %.

This is related to the missing handle_type_name<object> etc. which is supposed to be handled in your PR https://github.com/pybind/pybind11/pull/4888/files#diff-b8e97eea5f84a84248db1bc72afaccfd8ba28cad023917d52bdc6bbc3669cd74R877-R880 .

I added these 4 lines which should fix the test cases you mentioned.

hczhai commented 6 months ago

@rwgk

The type_mapper feature introduced in this PR adds a pretty significant amount of code and complexity. — That's a cost.

I agree that this PR is probably too complicated for the purpose. Especially when compared with: https://github.com/pybind/pybind11/pull/2244#issuecomment-641945474

We could just class_-wrap keys_view, values_view, items_view (non-templated!) once each.

I think this is a good idea. It can solve #4529 with the smallest cost. See #4985

rwgk commented 6 months ago

For completeness:

I added these 4 lines which should fix the test cases you mentioned.

After applying commit c43713f64a8c414ec73fc23838e5435e411b72aa, the only diff is this:

 class KeysView[int]:
-    @overload
-    def __contains__(self, arg0: int) -> bool: ...
-    @overload
     def __contains__(self, arg0: object) -> bool: ...

 class KeysView[str]:
-    @overload
-    def __contains__(self, arg0: str) -> bool: ...
-    @overload
     def __contains__(self, arg0: object) -> bool: ...
hczhai commented 6 months ago

After applying commit https://github.com/pybind/pybind11/commit/c43713f64a8c414ec73fc23838e5435e411b72aa, the only diff is this.

The overloading of __contains__ is an implementation detail introduced in PR #4353. Having only one version of __contains__ seems to be more pythonic (aligned with the goal of this PR). So if we merge this PR, then that diff should be applied to the test, I think. Same thing should happen in PR #4985.

rwgk commented 5 months ago

Closing this after #4985 was merged.