mesonbuild / meson-python

Meson PEP 517 Python build backend
https://mesonbuild.com/meson-python/
MIT License
118 stars 59 forks source link

Question: Wrapping nanobind / subproject extensions via meson-python #588

Closed WillAyd closed 3 months ago

WillAyd commented 4 months ago

I am trying to integrate some nanobind modules into a meson project. As things stand today, nanobind only supports CMake as a build system, and I am not aware of another project that has converted that to Meson.

As far as I can tell, the best way to structure this is to put the nanobind code that requires CMake into a subproject. So I end up with a structure like this:

project
  /src
    meson.build
  /subprojects
    /nb_libs
      my_ext.cpp

Is there any way at all to have the extensions built from subprojects into the /src build tree? I've tried something like this in meson.build:

cmake = import('cmake')
sub_proj = cmake.subproject('nb_libs')
my_ext_dep = sub_proj.dependency('my_ext')
py.extension_module('my_ext', dependencies: [my_ext_dep])

This gets me the structure I need, but generates a warning that it will be broken in a future version of meson:

WARNING: Build target new_vector.cpython-310-x86_64-linux-gnu has no sources. This was never supposed to be allowed but did because of a bug, support will be removed in a future release of Meson

and doesn't work at runtime I think because the module export function gets mangled in the process

>>> import my_ext
ImportError: dynamic module does not define module export function (PyInit_my_ext)

Another option I've considered is doing a custom_target that moves the subproject dependency output:

custom_target('move_new_vector',
    command : ['mv', '@INPUT@', 'src/my_ext.cpython-310-x86_64-linux-gnu.so'],
    output: 'my_ext.cpython-310-x86_64-linux-gnu.so',
    install_dir : 'src',
    input: vec_my_ext_dep
)

But that does not work because my_ext_dep is an InternalDependency, not a string (maybe meson has a way to convert this). It also seems pretty fragile, so hoping for a better way

eli-schwartz commented 4 months ago

Isn't nanobind just a header-only C++ library?

Edit: sorry, I guess it's a shared support library too, yeah.

rgommers commented 4 months ago

Thanks for the issue @WillAyd. I had a read through the nanobind doc section you linked and its CMake file and linker scripts, and that looks nontrivial to replicate indeed. I'm also really not sure if some of the things it does with linker flags in nanobind-config.cmake are going to hold up. E.g., distro maintainers tend to not like when you override their decisions by doing things like this:

https://github.com/wjakob/nanobind/blob/5752129aee83313e720adadfbd184e9b83fdd67a/cmake/nanobind-config.cmake#L243-L246

And silently not targeting the stable ABI like this may lead to issues, because the build backend won't know:

https://github.com/wjakob/nanobind/blob/5752129aee83313e720adadfbd184e9b83fdd67a/cmake/nanobind-config.cmake#L305-L309

This is done also if nanobind is installed as a shared library from a package manager, which is the recommended method it looks like. It's nothing like pybind11, where all you need is an include directory.

Is there any way at all to have the extensions built from subprojects into the /src build tree?

This is never right AFAIK; Meson is an out-of-tree build system, so all build artifact from the main project and all subprojects go into the build directory. There should be no need for it to be in-tree though? Was the problem that the shared library wasn't found in the build dir, or something else?

WillAyd commented 4 months ago

Thanks @rgommers - I probably am not expressing well enough what I am trying to do. So for instance in pandas we have a lot of "custom" vector code written in cython that sits in the pandas/_libs directory. I am interested in seeing how it would work to replace that Cython generated library with a nanobind generated library without affecting the imports of Python. But with the layout of:

pandas/
  _libs/
    hashtable.pyx
subprojects/
  nb_libs/
    vector_ext.cpp

I would ideally like to replace my from pandas._libs.hashtable import Int64Vector with pandas._libs.vector_ext import Int64Vector, but am not sure how that would be possible

rgommers commented 4 months ago

I would ideally like to replace my from pandas._libs.hashtable import Int64Vector with pandas._libs.vector_ext import Int64Vector, but am not sure how that would be possible

From Python code or Cython code? The former is straightforward, because it only matters what the install directory is (not the build dir, that can be anywhere). You specify that install location directly through the subdir keyword of py.extension_module. Cython can be a little more fiddly sometimes, but still it's possible; in SciPy we have lots of generated Cython code that also ends up in the build dir.

I'm happy to try your Pandas branch on the weekend if it's shareable?

eli-schwartz commented 4 months ago

I had a read through the nanobind doc section you linked and its CMake file and linker scripts, and that looks nontrivial to replicate indeed. I'm also really not sure if some of the things it does with linker flags in nanobind-config.cmake are going to hold up. E.g., distro maintainers tend to not like when you override their decisions by doing things like this:

Honestly, this is just classic cmake. You get tons of power to have a custom function inject wide-ranging changes into the whole project, but at the end of the day all people actually do with it is create a cmake function that runs add_library() with additional flags.

It's trivial to replicate in meson, assuming you want to:

and you are done. Meson already handles visibility for you. Nanobind doesn't default to LTO and neither does meson, but meson provides all the tools you need here.

The most significant factor here is that it looks like you need to compile nanobind's helper library differently depending on whether you are targeting the limited API or not. That should really be modeled by building and installing both. Then you need to know whether you are building in meson with the limited API or not, and choose which one to get with dependency(). You'd want to check the value of python.allow_limited_api before assuming anything.

WillAyd commented 4 months ago

I'm happy to try your Pandas branch on the weekend if it's shareable?

Happy to share, although its in a very initial state at the moment: https://github.com/WillAyd/pandas/tree/khashl-usage

You specify that install location directly through the subdir keyword of py.extension_module.

That's great to know! I think though that I would still have the same limitations as in the OP, i.e. I wouldn't be provided "sources" to the py.extension_module call and the entry symbol to the library might be getting mangled in that process

@eli-schwartz thanks for that feedback. Apologies for a potentially naieve question, but when you mention create a library() for nanobind is that something that would be done in the project I am working on, in a dedicated project that provides the nanobind+meson glue, or further up the chain somewhere?

eli-schwartz commented 4 months ago

It would be done in the subproject for nanobind itself, probably.

WillAyd commented 3 months ago

Looks like I can get a simple example built by setting up subprojects/nanobind.wrap to look like:

[wrap-git]
method = cmake
url = https://github.com/wjakob/nanobind/
revision = v1.9.2
depth = 1

[provide]
nanobind = nanobind_static_dep

And in the meson build file doing:

nanobind_dep = dependency('nanobind')
py.extension_module(
    'new_vector',
    ['new_vector.cpp'],
    dependencies: [nanobind_dep],
    subdir: 'pandas/_libs',
    install: true
)