pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/
Other
15.57k stars 2.09k forks source link

fix(types): render typed iterators in `make_*iterator` doc #4876

Closed sizmailov closed 11 months ago

sizmailov commented 12 months ago

Description

Render typed iterators for make_iterator, make_key_iterator, make_value_iterator.

Example:

from pybind11_tests import sequences_and_iterators as m
print(m.IntPairs.nonref.__doc__.strip())

### Output ###

# current master:
nonref(self: pybind11_tests.sequences_and_iterators.IntPairs) -> Iterator
# this PR:
nonref(self: pybind11_tests.sequences_and_iterators.IntPairs) -> Iterator[tuple[int, int]]

The benchmark shows no measurable differences in runtime performance.

bench.py

```python import timeit from pybind11_tests import sequences_and_iterators as m pairs = m.IntPairs([]) repeat = 1000000 print(timeit.timeit(lambda: pairs.nonref(), number=repeat)) print(timeit.timeit(lambda: pairs.nonref_keys(), number=repeat)) print(timeit.timeit(lambda: pairs.nonref_values(), number=repeat)) print(timeit.timeit(lambda: pairs.simple_iterator(), number=repeat)) print(timeit.timeit(lambda: pairs.simple_keys(), number=repeat)) print(timeit.timeit(lambda: pairs.simple_values(), number=repeat)) ```

The module size is increased by 8192 bytes (~0.2%):

du -b pybind11_tests.cpython-310-x86_64-linux-gnu.so
# current master:
3605824 ./pybind11_tests.cpython-310-x86_64-linux-gnu.so
# this PR
3614016 ./pybind11_tests.cpython-310-x86_64-linux-gnu.so

Previous attempts to address the same issue:

2244

2371

Suggested changelog entry:

Render typed iterators for ``make_iterator``, ``make_key_iterator``, ``make_value_iterator``
rwgk commented 11 months ago

FYI: I just initiated Google-global testing for this PR.

Mainly to see how many stubgen-generated .pyi files need to be updated.

BTW: When I deployed PRs #4831 & #4833 (in one combined update) I had to update 50 .pyi around the codebase, which was very unusual, but also not surprising (because #4833 changed List to list etc.).

rwgk commented 11 months ago

FYI: I just initiated Google-global testing for this PR.

I will have to update 2 .pyi files. I inspected the auto-generated changes and they look like a nice cleanup, e.g.

-    def keys(self) -> Iterator: ...
-    def values(self) -> Iterator: ...
+    def keys(self) -> Iterator[str]: ...
+    def values(self, *args, **kwargs) -> Any: ...
-    def __iter__(self) -> Iterator: ...
+    def __iter__(self) -> Any: ...

I'll go ahead with merging this PR.

rwgk commented 11 months ago

Note for completeness:

The

🐍 3.9-dbg (deadsnakes) • Valgrind • x64

CI failure is unrelated. First observed 2023-10-16 (yesterday).

sizmailov commented 11 months ago

@rwgk Thank you!

TBH I expected more files to be affected. Probably Google's code base does not expose many iterators.

The diff doesn't look like a 100% improvement to me.

The following stub got worse, but it's an issue of binding code, not pybind11.

-    def values(self) -> Iterator: ...
+    def values(self, *args, **kwargs) -> Any: ...

I would speculate that the snippet was generated by an older version of pybind11-stubgen and actual docstring contains raw C++ type, e.g.

def values(self) -> Iterator[std::complex<double>]

I have no good guess what happened to __iter__.

rwgk commented 11 months ago

they look like a nice cleanup, e.g.

Oof ... I was looking at it wrong, thanks for pointing out that this went the wrong way.

According to metadata that are usually very reliable we're using mypy at this commit:

https://github.com/python/mypy/commit/de4f2ad99c88b885677247534dcec335621ade4d

That's from Aug 11, 2023, just 2 months ago.

This is what the __doc__ looks like (manually simplified):

__iter__(self: x.y.z.UserTypeA) -> Iterator[tuple[str, cpp_namespace::UserTypeB]]

Could it be that :: in the name confuse stubgen?

sizmailov commented 11 months ago

The C++ signatures in docstrings should be fixed on the binding code side: https://pybind11.readthedocs.io/en/latest/advanced/misc.html#avoiding-c-types-in-docstrings

I don't how mypy.stubgen handles C++ signatures, probably the same way as pybind11-stubgen.

Yes, :: and angle brackets should make any stubgen unhappy.

rwgk commented 11 months ago

The C++ signatures in docstrings should be fixed on the binding code side:

Thanks for the pointer. I'll relay that to the owners of the code.

rwgk commented 11 months ago

Thanks for the pointer. I'll relay that to the owners of the code.

It's coming straight back at me like a boomerang:

  1. https://github.com/pybind/pybind11_protobuf/blob/6c081a7581e34d2f46cb9a6484eb88b5ba62f03f/pybind11_protobuf/proto_caster_impl.h#L193

  2. https://github.com/pybind/pybind11/blob/74439a64a22850f16323f1e1e09ace2a909b0b61/include/pybind11/pybind11.h#L494-L496

What could we do there (in pybind11/pybind11.h)?

sizmailov commented 11 months ago

What pybind11 does in the cited snippet above makes perfect sense. I don't think we should change that.

I believe it should be fixed on the pybind11_protobuf side.

If we are talking about fixing .pyi stubs files only and keeping C++ in docstrings, replacing C++ types can be done at the stubs parsing/post-processing stage.

rwgk commented 11 months ago

@sizmailov wrote:

I believe it should be fixed on the pybind11_protobuf side.

That's not possible: the Python name is not known at compile time (type_caster name is a constexpr).

After turning this over in my mind, discarding several ideas, and asking my team mates for advice, I landed on this:

https://github.com/pybind/pybind11/pull/4888

It's probably best to continue the discussion there.

Skylion007 commented 8 months ago

We should have added the test from this issue: https://github.com/pybind/pybind11/issues/4100 Can we verify we did not accidentally reintroduce this problem?