pybind / cmake_example

Example pybind11 module built with a CMake-based build system
Other
606 stars 215 forks source link

add Matlab GHA workflow (with a struct-definition addition to the basic example) #164

Open slayoo opened 8 months ago

slayoo commented 8 months ago

This PR adds a GHA workflow exemplifying usage of pybind11-built Python package from the Matlab Python interface. The GHA uses the https://github.com/matlab-actions/setup-matlab action to enable access to licensed Matlab.

It shows that:

This issue has been reported at:

rwgk commented 8 months ago

I don't have the permissions to approve this workflow. @henryiii could you please help out? (Could you change the global setting to be like what we have for pybind11?)

slayoo commented 8 months ago

@rwgk, FWIW, here are GHA logs from a PR from the same branch to my fork's master: https://github.com/slayoo/cmake_example/actions/runs/6918406701/job/18820634080

henryiii commented 8 months ago

I don't have access to the settings for this repo, only @wjakob does.

slayoo commented 8 months ago

thank you for approving the CI run, the Matlab failure is now visible in this PR

rwgk commented 8 months ago

We need to find out where this error is produced:

Cannot find class 'py.pybind11_builtins.pybind11_type'.

I looked in the pybind11 sources for simply Cannot (22 matches), but none of those possibly has "find" as the next word.

Almost certainly, that error is produced outside pybind11. Where and why?

rwgk commented 8 months ago

More simple observations:

pybind11_builtins (case sensitive search) only appears in include/pybind11/detail/class.h, this line in 3 locations:

    setattr((PyObject *) type, "__module__", str("pybind11_builtins"));

But that module does not actually exist (not a surprise to me, but I never thought about the background). Simple experiment to show that:

test_buffers.py:10: in <module>
    import pybind11_builtins
E   ModuleNotFoundError: No module named 'pybind11_builtins'

Which brings me back to my previous comment: Where and why is the error message produced?

wjakob commented 8 months ago

@rwgk, I added you as a maintainer.

rwgk commented 8 months ago

@rwgk, I added you as a maintainer.

Thanks!

slayoo commented 8 months ago

We need to find out where this error is produced:

Cannot find class 'py.pybind11_builtins.pybind11_type'.

IIUC, this would be within Matlab code base? @mcafaro, @acampbel, @mw-hrastega, could you confirm and give us any hints here? (context: pybind11-generated Python bindings err when used using the Matlab Python interface by producing such error while instantiating classes, what otherwise works OK outside of Matlab; in this PR, we use the matlab-actions to depict the problem on CI). Thanks!

rwgk commented 8 months ago

FWIW, I just also looked in the CPython 3.8 sources for "Cannot find", which has 15 matches, but none of those could possibly have "class" as the next word.

IIUC, this would be within Matlab code base?

Yeah, that's what I'm thinking, too.

rwgk commented 8 months ago

Crazy idea: what happens if you add pybind11_builtins.py on your PYTHONPATH with pybind11_type = type?

I don't really expect that to work, but maybe it produces an error that gives us more clues?

slayoo commented 8 months ago

Crazy idea: what happens if you add pybind11_builtins.py on your PYTHONPATH with pybind11_type = type?

I don't really expect that to work, but maybe it produces an error that gives us more clues?

Here's an attempt to inject it through the GHA script: https://github.com/pybind/cmake_example/pull/164/commits/a4d99d88818842c284f7962eee3115f95269150c + 5bebb1f (waits for CI approval to run). Thanks!

slayoo commented 8 months ago

@rwgk WOW, it helped!!!

   Python module with properties:

       AStruct: [1x1 py.pybind11_builtins.pybind11_type]
      subtract: [1x1 py.builtin_function_or_method]
           add: [1x1 py.builtin_function_or_method]

      <module 'cmake_example' from '/home/runner/work/cmake_example/cmake_example/cmake_example.cpython-38-x86_64-linux-gnu.so'>

  AStruct = 

    Python pybind11_type with no properties.

      <class 'cmake_example.AStruct'>

  ans = 

    Python AStruct with no properties.

      <cmake_example.AStruct object at 0x7f94e5d46b70>
rwgk commented 8 months ago

WOW, it helped!!!

Amazing!

I'm wondering: Could/should we create a pybind11_builtins module and populate it with the actual pybind11_type?

But there is a big hitch: Potentially there is a mix of extensions built with different pybind11 versions and/or different compilers. That could become very confusing.

I think it'll be best if we get feedback from Matlab developers first.

slayoo commented 7 months ago

WOW, it helped!!!

Amazing!

I'm wondering: Could/should we create a pybind11_builtins module and populate it with the actual pybind11_type?

But there is a big hitch: Potentially there is a mix of extensions built with different pybind11 versions and/or different compilers. That could become very confusing.

I think it'll be best if we get feedback from Matlab developers first.

@rwgk, how should we best act here? The ingenious workaround is likely sought by many users, but how to best "package" it? Also, perhaps including the Matlab "action" in pybind11 CI will also help prevent future incompatibilities.

rwgk commented 7 months ago

IMO my workaround is a hack, just enough to prove that Matlab implicitly introduces a requirement that nobody else has (that the metaclass is importable). Why that requirement exists I don't know. If they give us a convincing reason, maybe we can adjust. But based on what I know at the moment, I'd say it'll be far better if Matlab is changed to not introduce that requirement. Maintaining a hack here will be a drag.

StauntonXLIV commented 7 months ago

I'm trying to instantiate a a pybind11 class (from a perfectly working python module) into Matlab and getting the same error. The However, I'm on Windows... What changes would I need to make, if any, to make this hack work? Python in Windows doesn't use PYTHONPATH.

rwgk commented 7 months ago

from a perfectly working python module

Assuming that perfectly working python module lives in a directory, simply put pybind11_builtins.py in there as well. If python finds your existing module there already, it should find pybind11_builtins.py there as well.

Another idea is this (tested only locally; I added that code more-or-less randomly to an existing test):

diff --git a/tests/test_async.cpp b/tests/test_async.cpp
index a5d72246..6e0c1616 100644
--- a/tests/test_async.cpp
+++ b/tests/test_async.cpp
@@ -22,4 +22,6 @@ TEST_SUBMODULE(async_module, m) {
             f.attr("set_result")(5);
             return f.attr("__await__")();
         });
+
+    m.def("module_new", [](const char *name) { return py::reinterpret_steal<py::object>(PyModule_New(name)); });
 }
diff --git a/tests/test_async.py b/tests/test_async.py
index 83a4c503..f3db4385 100644
--- a/tests/test_async.py
+++ b/tests/test_async.py
@@ -22,3 +22,11 @@ def test_await(event_loop):
 def test_await_missing(event_loop):
     with pytest.raises(TypeError):
         event_loop.run_until_complete(get_await_result(m.DoesNotSupportAsync()))
+
+
+def test_module_new():
+    virtual_pybind11_builtins = m.module_new("pybind11_builtins")
+    virtual_pybind11_builtins.pybind11_type = type
+    import sys
+    sys.modules["pybind11_builtins"] = virtual_pybind11_builtins
+    import pybind11_builtins
StauntonXLIV commented 7 months ago

from a perfectly working python module

Assuming that perfectly working python module lives in a directory, simply put pybind11_builtins.py in there as well. If python finds your existing module there already, it should find pybind11_builtins.py there as well.

Another idea is this (tested only locally; I added that code more-or-less randomly to an existing test):

Thank you so much! I couldn't get the first suggestion to work, maybe because that module uses a .pyd referencing a precompiled .dll...

What worked for me was to move the pybind11_builtins.py file into my venv Lib directory, and manually importing it into my Matlab pyenv. After that, importing my originally intended module worked perfectly.

rwgk commented 5 months ago

It crossed my mind, another option is to use py::metaclass((PyObject *) &PyType_Type), like here:

    py::class_<YourType>(m, "YourType", py::metaclass((PyObject *) &PyType_Type))

The only potential downside is that static properties behave differently.

slayoo commented 5 months ago

@rwgk thanks! The above commit e1a73a4 is intended to try if it works. CI run requires maintainer approval.

rwgk commented 5 months ago

It seems to work. I created pybind/pybind11#5015 to make using this trick a little more obvious and legit-looking.

slayoo commented 5 months ago

Thanks @rwgk !

slayoo commented 5 months ago

@rwgk We're trying to include this "trick" in the PyPartMC project where the original issue was discovered. Compilation is fine, instantiation of the objects work, and Python tests pass. In the Matlab tests, instantiation goes fine (without the pybind11_builtins.py file), but trying to pass an object as an argument to a function fails with:

  Error using command_03eca20b_7d9b_4dfc_8ae3_5abbfa4e5128
  Python Error: TypeError: dist_sample(): incompatible function arguments. The
  following argument types are supported:
      1. (self: _PyPartMC.AeroState, AeroDist: AeroDist, sample_prop: float =
      1.0, create_time: float = 0.0, allow_doubling: bool = True, allow_halving:
      bool = True) -> int

  Invoked with: <_PyPartMC.AeroDist object at 0x7f6887adaab0>

So, the dist_sample() function expects an AeroDist object, but when it gets it is reported to be of an incompatible type. The same logic works when executed from Python without the Matlab bridge layer.

Any hints? Thanks

rwgk commented 5 months ago

The same logic works when executed from Python without the Matlab bridge layer.

Any hints? Thanks

Did it work before, with the default (custom) pybind11 metaclass and the alternative pybind11_type trick?

I'm interested in finding out why you see the failure. Could it be reproduced here, in your matlab.yml job?

slayoo commented 5 months ago

Yes, it did work with the pybind11_type trick. I'll try to reproduce it here...

rwgk commented 5 months ago

The matlab job is still passing here.

Totally guessing: do you have multiple extension modules in your original failure case (Python Error: TypeError: dist_sample(): incompatible function arguments.)?

rwgk commented 5 months ago

Generally, if guessing a reproducer doesn't work straightaway, I give up very quickly and turn to reducing (or debugging) the original problem instead.

slayoo commented 5 months ago

debugging is tricky as I do not have access to Matlab beyond the Github Actions CI :) (the above commit checks if the use of py::arg could be the reason)

rwgk commented 5 months ago

I was thinking of adding print statements in the pybind11 sources to see where/why it's rejecting the argument. We wouldn't have to get into matlab. — I can help inserting prints; but we need a failure to start with.

YannickJadoul commented 5 months ago

@slayoo, just a comment as I saw an email passing by. For what it's worth: importing pybind11 modules used to work in older versions. I just tried and I can perfectly import/use my own library, which is currently still built with pybind11 v2.6.2 (with a couple of patches to support newer Python versions. I am 100% sure pybind11_builtins.pybind11_object was already the base class of pybind11 classes in this version. This is with MATLAB 2023b on Linux:

>> class(py.parselmouth.Sound("~/the_north_wind_and_the_sun.wav"))

ans =

    'py.parselmouth.Sound'

So if you want to better understand, you might also want to bisect a few pybind11 releases and track down where this went wrong.