Open WillAyd opened 2 weeks ago
IIRC releases.json has to be in alphabetical order, nanobind is right before nanoarrow but should instead be right after.
That makes sense. Looks like some of the build failures have to do with the robin-map depdendency version as well - may need to update that as a precursor
@eli-schwartz @WillAyd : If I understand correctly, the goal of this PR is to build using pure Meson without depending on CMake? This would be awesome, but the recipe here looks somehow too easy to me.
Please take a look at some of the steps explained in the comment at the top of https://github.com/wjakob/nanobind/blob/master/src/nb_combined.cpp. And this is the "ground truth" reference for the extension build process: https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake
For example, does this wrapper
-fno-strict-aliasing
) in the static library part only?NB_DOMAIN
parameter? This is in general something that almost every extension should do to avoid cross-contamination.Quite a lot of work has gone into the CMake interface produce nice output—I would be concerned about alternatives that produce subpar extensions by missing some of these steps. A good recipe that helped to reproduce the behavior in the Bazel build system (https://github.com/nicholasjng/nanobind-bazel) has been to run nanobind's CMake build sytem on a simple extension in verbose mode and then try to exactly reproduce all of the compiler and linker invocations with their flags on macOS+Windows+Linux (some subtle differences there as well).
I'm happy to provide further detail on anything with the caveat that I'll be traveling in July.
For example, does this wrapper
- Avoid linking against a specific libpython.so/dylib? Basically, we don't want to bake a libpython path into the compiled shared library—the interpreter loading the extension will already have all of the libpython symbols.
- Compute the extension suffix suitable for the desired Python ABI and stable/non-stable convention?
These questions are related.
That's something every python module already has to handle. Setuptools handles this automatically, as does maturin. Meson handles it automatically too! CMake doesn't support it, which is why scikit-build and I assume nanobind as well, provide their own cmake modules that define functions which override cmake add_library.
Meson calls this pymod.extension_module()
.
- Set compilation flags required for correctness (
-fno-strict-aliasing
) in the static library part only?
This is a legacy of python 2 which violates the C/C++ language standards. It's not valid code if it has to compile with disabling the mandatory aliasing rules of the C++ language.
Since many people get aliasing wrong even though it breaks the language (and so did CPython 2.7!) -- and because sometimes the fact that your code breaks the requirements is difficult to spot -- compiler vendors offer a compiler extension that tells the compiler to cripple its codegen and produce really slow code that tiptoes around aliasing, adding redundant checks here there and everywhere to check at runtime whether the code actually followed the aliasing rules, and recover if it didn't follow the rules.
Aliasing violations can't come from CPython since CPython fixed that. They can still come from projects' own code, or from their nanobind dependency.
I think you're saying that nanobind needs this extra protection as the cpp_args
kwarg to the library() call for nanobind itself? That can be added.
- Support stable ABI builds?
Meson handles this automatically as part of pymod.extension_module()
. If nanobind needs to be built with the limited API define then it should be manually calculated based on the -Dpython.allow_limited_api=true
builtin option, as meson doesn't handle this for static libraries that use CPython and get linked into extension modules.
It may make sense to extend the meson language to handle static libraries here somehow. I'll think about this...
- Set flags that are a good idea to obtain small binaries when making release builds ('-DNDEBUG' and '-DNB_COMPACT_ASSERTIONS', '-ffunction-sections/-fdata-sections/--gc-sections', various stripping flags -- see nanobind's CMake build system for details on these steps)
This can be added to the declare_dependency()
function call in this PR. Meson does have a builtin -Db_ndebug=true
option which should be respected, though.
- Use the chained fixups optimization for library builds on macOS? Nanobind ships a custom linker response file to enable this.
- Support isolating extensions from others by specifying the
NB_DOMAIN
parameter? This is in general something that almost every extension should do to avoid cross-contamination.
Those are interesting points. The linker response file can probably be handled just like point 5. I'm not sure how to handle point 7 (users could I suppose add that to their own extensions).
Quite a lot of work has gone into the CMake interface produce nice output—I would be concerned about alternatives that produce subpar extensions by missing some of these steps. A good recipe that helped to reproduce the behavior in the Bazel build system (https://github.com/nicholasjng/nanobind-bazel) has been to run nanobind's CMake build sytem on a simple extension in verbose mode and then try to exactly reproduce all of the compiler and linker invocations with their flags on macOS+Windows+Linux (some subtle differences there as well).
I'm happy to provide further detail on anything with the caveat that I'll be traveling in July.
Thanks for your insights!
9. Set flags that are a good idea to obtain small binaries when making release builds ('-DNDEBUG' and '-DNB_COMPACT_ASSERTIONS', '-ffunction-sections/-fdata-sections/--gc-sections', various stripping flags -- see nanobind's CMake build system for details on these steps)
This can be added to the
declare_dependency()
function call in this PR. Meson does have a builtin-Db_ndebug=true
option which should be respected, though.
Does Meson not set most of these in the first place for a release build? I added -DNB_COMPACT_ASSERTIONS but from previous testing the rest were automatically added - so I don't think need to try and manually add here?
Ignore my previous comment - those flags probably came from the CMake wrap I tried before. Will take another look
Two quick responses to @eli-schwartz:
CMake doesn't support it, which is why scikit-build and I assume nanobind as well, provide their own cmake modules that define functions which override cmake add_library.
Incorrect -- CMake handles it too (via Python_add_library
). But the way that CMake does this on macOS (by specifying the legacy -undefined dynamic_lookup
linker flag) is incompatible with the recently added chained fixups linker optimization, so nanobind uses a custom solution. There is lots of discussion about it here.
Aliasing violations can't come from CPython since CPython fixed that.
To my knowledge (and I would be really curious to be pointed to something saying otherwise) Core CPython C code to the cannot be compiled as regular C++ code. It heavily relies on macros that cast and reinterpret data structures that aren't related to each other through inheritance, which violates C++ type punning rules. The -fno-strict-aliasing
is therefore required when compiling CPython-style C code in a C++ context. The reason why nanobind specifies this flag to the "libnanobind" utility library component is because it is written like regular CPython code, making plentiful use of such data structure reinterpretation that would be suspect in C++. But this flag isn't imposed on the remainder of the (user-provided) bindings, so aliasing analysis remains useful there.
To my knowledge (and I would be really curious to be pointed to something saying otherwise) Core CPython C code to the cannot be compiled as regular C++ code. It heavily relies on macros that cast and reinterpret data structures that aren't related to each other through inheritance, which violates C++ type punning rules.
I'm not sure about this. C of course has its own type punning rules. CPython 2.x did have an infamous problem with it (https://peps.python.org/pep-3123/). I'm not aware of any issue in modern CPython.
numpy also has a bunch of C++ which is built without -fno-strict-aliasing
(discussed at https://github.com/numpy/numpy/issues/25004) and nothing has gone wrong there.
Further, I'm not aware of any construct used in CPython which is legal under C aliasing rules but suddenly illegal under C++ rules. Could you point me to it?
@wjakob please link where in the Python documentation C++ compilation requires -fno-strict-aliasing
. C and C++ literally have pretty much the exact same strict aliasing rules. Specifically, the rules that PEP 3123 relies are the same in C++, so your claims require actual evidence.
I would use the string method .version_compare('<3.12')
Any guidance on how to handle the MSVC errors about a debug build with a release Python interpreter?
nanobind| WARNING: Using a debug build type with MSVC or an MSVC-compatible compiler nanobind| when the Python interpreter is not also a debug build will almost nanobind| certainly result in a failed build. Prefer using a release build nanobind| type or a debug Python interpreter.
Should I just force a release build on msvc?
Setting the release buildtype in ci_config.json is definitely a reasonable way to deal with this.
The alpine failure is probably because alpine uses -dev packages just like Ubuntu.
Have a feeling the VisualStudio CI builds need something else besides just Python installed for the import to work. I couldn't find a -dev package in choco - any ideas?
After disabling fail-fast for the VisualStudio matrix (cf. #1561), the x64 job fails too.
Seems like the current VisualStudio x64 failure is reported upstream and a bug with the latest msvc compiler version
And for the x32 job: https://github.com/mesonbuild/wrapdb/pull/1562. With that get similar errors.
Thanks @benoit-pierre . My assumption is those two jobs will continue to fail until a newer msvc release comes out (or we can try to test against an older compiler version)
cc @wjakob