sizmailov / pybind11-stubgen

Generate stubs for python modules
Other
218 stars 44 forks source link

Using `KeysView`/`ValuesView`/`ItemsView` across sub modules can result in "can't find/import" error #190

Open bluenote10 opened 7 months ago

bluenote10 commented 7 months ago

Consider the following example:

#include <functional>
#include <unordered_map>

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/stl_bind.h>

namespace py = pybind11;

struct MyKeyA
{
  bool operator==(const MyKeyA&) const = default;
};

struct MyKeyB
{
  bool operator==(const MyKeyB&) const = default;
};

template <>
struct std::hash<MyKeyA>
{
  size_t operator()(const MyKeyA& obj) const
  {
    return 0;  // omitted.
  }
};

template <>
struct std::hash<MyKeyB>
{
  size_t operator()(const MyKeyB& obj) const
  {
    return 0;  // omitted.
  }
};

PYBIND11_MODULE(my_native_module, m)
{
  py::module_ module_a = m.def_submodule("module_a");
  py::class_<MyKeyA>(module_a, "MyKeyA");
  py::bind_map<std::unordered_map<MyKeyA, int>>(module_a, "MapAToInt");  // (1)

  py::module_ module_b = m.def_submodule("module_b");
  py::class_<MyKeyB>(module_b, "MyKeyB");
  py::bind_map<std::unordered_map<MyKeyA, bool>>(module_b, "MapAToBool");  // (2)
  py::bind_map<std::unordered_map<MyKeyB, bool>>(module_b, "MapBToBool");
}

The pattern here is:

Trying to generate stubs for the module fails with:

pybind11_stubgen - [  ERROR] In my_native_module.module_b.MapAToBool.keys : Can't find/import 'my_native_module.module_a.KeysView'
pybind11_stubgen - [   INFO] Terminating due to previous errors

The issue seems to be triggered by the co-existence of lines (1) and (2), i.e., if KeysA gets used in module_a in a map, it somehow "moves the KeysView[...] type over" into that module. By commenting out line (1), the KeysView ends up in the other sub-module which works. Conversely, commenting out line (2) also works, because then module_b is not trying to reference the invalid KeysView path.

For comparison, when commenting out line (1) the symbols in submodule a/b are divided as (trimmed outputs of dir(my_native_module.module_X)):

# Symbols in module_a:
MyKeyA

# Symbols in module_b:
ItemsView[my_native_module.module_a.MyKeyA, bool]
ItemsView[my_native_module.module_b.MyKeyB, bool]
KeysView[my_native_module.module_a.MyKeyA]
KeysView[my_native_module.module_b.MyKeyB]
MapAToBool
MapBToBool
MyKeyB
ValuesView[bool]

Note that all the KeysView are local to b. Commenting in line (1) "moves" the KeysView into module_a leading to the problematic module path:

# Symbols in module_a:
ItemsView[my_native_module.module_a.MyKeyA, int]
KeysView[my_native_module.module_a.MyKeyA]
MapAToInt
MyKeyA
ValuesView[int]

# Symbols in module_b:
ItemsView[my_native_module.module_a.MyKeyA, bool]
ItemsView[my_native_module.module_b.MyKeyB, bool]
KeysView[my_native_module.module_b.MyKeyB]
MapAToBool
MapBToBool
MyKeyB
ValuesView[bool]
sizmailov commented 7 months ago

The error message Can't find/import 'my_native_module.module_a.KeysView' is factually correct. The my_native_module.module_a does not have KeyView, but has KeysView[my_native_module.module_a.MyKeyA] (sic!).

The root problem is that pybind11 generates classes with brackets in the names, for example:

ItemsView[my_native_module.module_a.MyKeyA, bool]

I choose to skip such classes during discovery to avoid generating stubs with invalid syntax, e.g.:

class ItemsView[my_native_module.module_a.MyKeyA, bool]:
    ...

You don't usually see the problem because:

  1. pybind11-stubgen strips the current module path from the annotations my_native_module.module_a.KeysView[my_native_module.module_a.MyKeyA] becomes KeysView[my_native_module.module_a.MyKeyA]
  2. KeysViews is replaced with typing.KeyViews

If you use the type from another module first step fails and you see a strange error message.

What can be done about them? I think the specific module-level annotations like my_native_module.module_a.KeysView do not bring any value compared to typing.KeyView, so they could be universally replaced.

The only problem is robustly detecting such classes in references and avoiding disturbing error messages.

bluenote10 commented 7 months ago

What can be done about them? I think the specific module-level annotations like my_native_module.module_a.KeysView do not bring any value compared to typing.KeyView, so they could be universally replaced.

Yes, this is in fact what we have been doing so far in a manual post-processing when using the mypy stub generator, and this heuristic works reasonably well so far.

Of course it means that the names KeysView, ValuesView, and ItemsView are more or less reserved names that get special treatment. If a binding wants to expose a user-defined class that shares that name, things could fall apart. However, this could be easily worked around by simply avoided that particular name, i.e., even if a C++ type happens to be called KeysView it could be exported under a different name like "KeysViewCpp" or so. Considering that these three types already have a meaning in the Python type system that might be a sensible thing to do anyway to avoid confusion.