Open rgommers opened 2 years ago
Also perhaps useful info, here are the docs for how CMake does it: https://cmake.org/cmake/help/latest/module/FindPython3.html#findpython3.
find_package (Python3 COMPONENTS Interpreter NumPy)
target_link_libraries(... Python3::NumPy)
Variables:
Python3_NumPy_FOUND
Python3_NumPy_INCLUDE_DIRS
Python3_NumPy_VERSION
Version number will indeed be useful. CMake doesn't seem to have support for `libnpymath` and `libnpyrandom`. `modules` seems like the correct approach here.
BTW if numpy could add pkg-config files for at least the case where it is built from source and installed as a system package that would be super convenient, we could use the numpy.get_include stuff as a fallback. ;)
Custom lookup dependencies can have subdependencies, i.e. I think it makes sense that numpy should always have a recursive dep on python?
BTW if numpy could add pkg-config files for at least the case where it is built from source and installed as a system package that would be super convenient, we could use the numpy.get_include stuff as a fallback. ;)
Yes, that does seem like a good idea. I have thought about it, but would prefer to do it only once NumPy is switching over to Meson itself. Otherwise I'll have to implement it twice, and adding new features to `numpy.distutils` is not a very enjoyable experience.
Custom lookup dependencies can have subdependencies, i.e. I think it makes sense that numpy should always have a recursive dep on python?
Yes, that makes sense to me - the NumPy C API and the C code that f2py generates both depend on the CPython C API.
@eli-schwartz I have a WIP branch here and could use a few pointers: https://github.com/mesonbuild/meson/compare/master...rgommers:meson:numpy-dependency?expand=1
Does that look in the right direction, or should it go into `modules/python.py`? How do I get at the `python` executable found by `dependency('python')`?
How do I get at the `python` executable found by `dependency('python')`?
The python module's `.dependency()` method is a bit different from the one that is a global dependency, for instance it doesn't allow specifying which python executable to use.
This actually plays into the issue of where to put it. There's two implementations of a python dependency that aren't necessarily in sync, and we should consolidate them in, say, `dependencies/python.py`. I wonder, can we sneak a PythonExternalProgram in as an optional value...
This actually plays into the issue of where to put it. There's two implementations of a python dependency that aren't necessarily in sync
Aside from `python` vs. `python3`? The latter can probably be removed by now? There's also what looks like a leftover thing in `dependencies/misc.py`.
I don't know if it ever landed, but I had code that made them all the same with a wrapper around the misc implementation (moved to a new module).
I'm pretty sure it did not land, because it's currently still a mess.
But this is what I've been working on today: https://github.com/eli-schwartz/meson/commits/python-dependency-refactor
The third commit is surely broken... it's half-finished work.
I was bitten this weekend by the numpy detection logic when trying to move Void Linux from SciPy 1.8.1 to SciPy 1.9.0 with Meson. I don't have much to add from my experience:
- [...] `/usr/<triple>` to all of the paths returned by `numpy.get_include()` still leads `find_library` to find the native `npymath` (and, I presume, `npyrandom`) rather than the right version in the build prefix. I didn't dig too deeply into this.
- [...] `import numpy`, it wants to load shared objects that might be for the wrong architecture.)

I was just going to file an issue tagging @eli-schwartz for advice. As he's already thinking about this problem, I'll just say "me too" and watch what happens. We're sticking with the distutils build in Void for now.
It's not entirely clear to me how this works... You have the build machine python but with the cross-compile numpy and you want to build a cross-compile SciPy? If you were using the cross-compile python then I'd be hopeful that this works fine.
Of course, not having to run cross-compile tools via qemu-user is the exact reason why pkg-config is so much better than config-tool stuff (shakes fist at llvm-config) so there is that...
In Void, we don't run the build Python when cross-compiling native Python extensions. Instead, we use the host Python but override the sysconfig and set custom prefix and path variables to allow the Python packaging tools to gather information (field sizes, shlib suffix, etc.) about the build platform rather than the host platform. This "mostly" works for setuptools/distutils although, for the very issues we're seeing with the move to meson in SciPy, we hack around some search paths when building things that link with NumPy. Yes, this can be fragile, but it's Python packaging...
Void also installs wrappers for a lot of tools (pkg-config comes to mind) ahead of the default system path to make sure that we are searching the right paths for build dependencies.
Instead, we use the host Python but override the sysconfig and set custom prefix and path variables to allow the Python packaging tools to gather information (field sizes, shlib suffix, etc.) about the build platform rather than the host platform.
Yes, that's clearly not going to work with the way I currently implemented `numpy` dependency detection inside SciPy. When we have a builtin `numpy` dependency in Meson, it should work without having to run any Python code (host or build); the relevant numpy paths are deterministic given the host platform site-packages location. I could probably even get rid of that `numpy.get_include()` usage now, this is just the first actual bug report that motivates doing so.
@ahesford part of the things you have in your bullet list above need updating for Meson I think. One of the benefits of moving to Meson is proper cross-compiling support via a cross build definition file (https://mesonbuild.com/Cross-compilation.html). I'd be quite interested in making this work in SciPy itself (xref https://github.com/scipy/scipy/issues/14812) and having a SciPy CI job which cross-compiles (e.g. a two-stage GitHub Actions job, x86-64 to aarch64 build, then run basic tests in an aarch64 container - or whichever other platform combo makes sense). Would you want to collaborate on that? Or if not, can you open a SciPy issue with as much detail as possible (a lot of which you have above already) so I can add it to our tracking issue?
@ahesford part of the things you have in your bullet list above need updating for Meson I think. One of the benefits of moving to Meson is proper cross-compiling support via a cross build definition file (https://mesonbuild.com/Cross-compilation.html). I'd be quite interested in making this work in SciPy itself (xref scipy/scipy#14812) and having a SciPy CI job which cross-compiles (e.g. a two-stage GitHub Actions job, x86-64 to aarch64 build, then run basic tests in an aarch64 container - or whichever other platform combo makes sense). Would you want to collaborate on that? Or if not, can you open a SciPy issue with as much detail as possible (a lot of which you have above already) so I can add it to our tracking issue?
Sorry for the delayed response. I've opened https://github.com/scipy/scipy/issues/16783 to document SciPy specifics and get people thinking about possible resolutions. I'm happy to collaborate on a solution; SciPy and NumPy are critical tools for me (which is why I maintain the major parts of the Python numeric stack for Void) and my increasing familiarity with meson convinces me that it's more pleasant than pretty much all other build systems.
In my initial issue description here, I missed that `python.find_installation` already has a `modules` keyword. The release note for that feature contains:
py = import('python').find_installation('python3', modules : ['numpy'])
Making `numpy` a separate dependency is kind of a pain, because we then don't know to which Python interpreter it belongs - which starts to matter very quickly when cross-compiling. We may need to run `numpy.f2py` as a code generator, so that would need it from the native Python (see https://github.com/scipy/scipy/blob/main/tools/generate_f2pymod.py#L278-L283 for SciPy) and we also need its include dir from the host Python. For `numpy.get_include()`, which gets at the headers needed to use the NumPy C API, we need it from the cross Python. See https://github.com/scipy/scipy/blob/main/scipy/meson.build#L39-L84
`crossenv` appears to have a bunch of hacks to work around the issues, but it seems to me like what we really need is to look for two Python interpreters in general. For SciPy in particular:
py_mod = import('python')  # we just have one of these ....

# the host Python
py = py_mod.find_installation(
  modules: ['numpy', 'pybind11'],
  pure: false,
)
py_dep = py.dependency()

# We need the native Python from the build machine to run some code generators:
py_native = py_mod.find_installation(
  native: true,  # FIXME: this keyword does not exist
  modules: ['numpy', 'pythran'],  # code generation tools
  pure: false,
)
Looking at https://mesonbuild.com/Python-module.html, there is no way to actually ask for the native Python during a cross build. It's not clear to me if this is the best way, or this is supposed to work differently. @eli-schwartz WDYT?
I assume the code generation tools can run on either the native or the cross compile python, and still produce the same outputs.
Do you need to guarantee that exactly the same version of each is used? Do the versions not particularly matter? Do they matter, but only a minimum version is necessary, not a maximum version?
I assume the code generation tools can run on either the native or the cross compile python, and still produce the same outputs.
Yes, I think that is true. Except that one doesn't want to require QEMU to actually run them, hence the desire to allow specifying that they come from the native Python.
Do you need to guarantee that exactly the same version of each is used? Do the versions not particularly matter? Do they matter, but only a minimum version is necessary, not a maximum version?
Not always needed to use the same version, but it should be possible. However, I think that's up to the user to set up the build environment - probably no need for Meson to enforce that. I think that's the answer for all of these. `meson-python` already uses a native file to control the exact Python interpreter selection. I'm not sure anything is needed beyond that.
https://conda-forge.org/docs/maintainer/knowledge_base.html#cross-compilation already makes the two sets of dependencies explicit when cross-compiling, and distinguishes between `python` as a build-only dependency (for code generators I guess) and a runtime dependency: https://conda-forge.org/docs/maintainer/knowledge_base.html#build-matrices.
In that case, I wonder if it makes sense to just use `find_program('pythran', native: true)` and `find_program('f2py', native: true)`?
Inside of build isolation, the $PATH is now set up to actually find those properly, which required a fix in pip that @dnicolodi made. This is important for meson-python for finding Meson itself, although only if Meson isn't pre-installed at the OS level before performing a build.
(People doing cross builds are probably not using build isolation 😜 and $PATH is generally guaranteed to have native executables. But you can also use the machine file to define those executables too, of course.)
I'm not sure I completely understand what this issue is proposing; however, I think that there must be a clear distinction between code generation tools and libraries. For code generation tools, `find_program('tool')` needs to work. As these are code generation tools, for cross compilation it does not make much sense to have these tools defined in the cross file, thus `native: true` should not even be necessary. If it does not work, it is an issue with how the tool is deployed on the build system or a more general issue with the tool itself. Working around it in Meson does not seem to be a good or sustainable strategy. I think that `cython`, `pythran` and `f2py` all work in this way (although they need to be installed for the host Python).
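As a rough illustration of the lookup described in this comment, the Python sketch below resolves code generation tools by name on the build machine's `PATH` with the standard library's `shutil.which`. It is only an analogy for what `find_program('tool')` is expected to achieve, not Meson's implementation, and `locate_tools` is a hypothetical helper name.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(names: Iterable[str], path: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each tool name to its full path, or None if it is not found.

    `path` overrides the search path (same format as the PATH variable);
    when it is None, the PATH of the current process is used.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    # Tool names taken from the comment above; they may not be installed.
    for tool, location in locate_tools(["cython", "pythran", "f2py"]).items():
        print(f"{tool}: {location or 'not found'}")
```

If a tool is reported as not found here, that points at how the tool is deployed in the build environment, which is the situation the comment argues should be fixed outside of Meson.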
Libraries are another matter. Unfortunately, AFAIK there isn't a standardized way for Python packages to expose this information. What NumPy does is not optimal because it requires executing Python code, which complicates cross compilation (in general the cross compilation story for Python packages is not great). One possible solution to investigate could be to add the required information to the wheel metadata (IIUC the wheel metadata can be extended beyond what is defined in the PEPs). Doing so would possibly open the door to reading the metadata fields without running target system code.
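To make the "read metadata without running target code" idea concrete, here is a minimal Python sketch that reads the `Version` field from an installed distribution's `METADATA` file. It assumes the usual `<name>-<version>.dist-info/METADATA` layout inside a site-packages directory; `read_installed_version` is a hypothetical helper. Include-directory information is not part of this metadata today, so the sketch only demonstrates the reading mechanism.

```python
from email.parser import HeaderParser
from pathlib import Path
from typing import Optional


def read_installed_version(site_packages: Path, dist_name: str) -> Optional[str]:
    """Return the Version recorded in `<dist_name>-*.dist-info/METADATA`.

    Only files are read; the package itself is never imported, so this also
    works when `site_packages` belongs to an interpreter for another
    architecture. Returns None if no matching distribution is found.
    """
    for dist_info in sorted(Path(site_packages).glob(f"{dist_name}-*.dist-info")):
        metadata = dist_info / "METADATA"
        if not metadata.is_file():
            continue
        # METADATA uses RFC 822 style headers, so the email parser can read it.
        headers = HeaderParser().parsestr(metadata.read_text(encoding="utf-8"))
        if headers.get("Name", "").lower() == dist_name.lower():
            return headers.get("Version")
    return None
```

For example, `read_installed_version(Path("/path/to/site-packages"), "numpy")` would return the installed NumPy version string, or `None` if NumPy is not installed there.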
incdir_numpy = run_command(py3,
  [
    '-c',
    'import os; os.chdir(".."); import numpy; print(numpy.get_include())'
  ],
  check: true
).stdout().strip()
inc_np = include_directories(incdir_numpy)
I've seen this code replicated in a few projects, but it is problematic. First, it requires the NumPy header directory to be a subdirectory of the Meson project, which is anything but guaranteed. I think
np_dep = declare_dependency(include_directories: incdir_numpy)
is the way to go. Second, even when the NumPy includes are installed in a subdirectory of the Meson project, the code executed to get the header directory needs to return a relative path, and it does not, at least on my system. This code does:
incdir_numpy = run_command(py3,
[
'-c',
'import os, numpy; print(os.path.relpath(numpy.get_include()))'
],
check: true
).stdout().strip()
I've seen this code replicated in a few projects, but it is problematic. First, it requires the NumPy header directory to be a subdirectory of the Meson project, which is anything but guaranteed. I think
The chdir definitely doesn't say anything about it being or not being a subdirectory of the meson project. I think it's an attempt to evade python's default PYTHONPATH allowing the project source code to be "accidentally imported" when you don't want that.
As for relative vs. absolute, you're "supposed to" have your virtualenv with numpy installed be somewhere other than inside your source tree. Because get_include() returns an absolute path, and that's completely okay if that path is, say, /usr/lib/python3.10/site-packages/numpy/core/include
Aside: IMO the proper solution for pybind11 is to handle it like a config-tool, and I actually have that all planned out as soon as my python dependency refactor PR is merged.
The chdir definitely doesn't say anything about it being or not being a subdirectory of the meson project. I think it's an attempt to evade python's default PYTHONPATH allowing the project source code to be "accidentally imported" when you don't want that.
I'm not sure this is the case: it would matter only inside a project that has a `numpy` top level directory that is also a Python package, and this is very unlikely to be the case for anything other than NumPy itself. But it is a bit silly for NumPy to look up its own headers by importing `numpy`. I've no idea what the `chdir()` is trying to accomplish there.
As for relative vs. absolute, you're "supposed to" have your virtualenv with numpy installed be somewhere other than inside your source tree. Because get_include() returns an absolute path, and that's completely okay if that path is, say, /usr/lib/python3.10/site-packages/numpy/core/include
Right (I always find the error message Meson spits out when the path is absolute but the directory is inside the source tree confusing). But using `declare_dependency()`, the requirement for the include path to be relative if inside the source tree and absolute when outside is relaxed, thus there is no requirement on how the virtualenv is set up. I always have the virtual env for a project instantiated inside the project source directory. Imposing a specific layout for the virtual envs is a bit annoying.
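The relative-inside, absolute-outside rule mentioned in this comment can be sketched in Python as follows. This is a minimal illustration under the assumption that relative paths are interpreted relative to the directory of the calling `meson.build`; the helper name `include_dir_argument` and its parameters are hypothetical, and Meson's actual checks are not reproduced here.

```python
import os
from pathlib import Path
from typing import Union

PathLike = Union[str, "os.PathLike[str]"]


def include_dir_argument(source_root: PathLike, current_dir: PathLike,
                         include_dir: PathLike) -> str:
    """Choose how to spell `include_dir` for an include path argument.

    Inside the source tree: a path relative to `current_dir`.
    Outside the source tree: the absolute path.
    """
    root = Path(source_root).resolve()
    target = Path(include_dir).resolve()
    if target == root or root in target.parents:
        return os.path.relpath(target, Path(current_dir).resolve())
    return str(target)
```

With a virtualenv created inside the project directory, the helper yields a relative path such as `../.venv/...`, while a system-wide site-packages directory is returned unchanged as an absolute path.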
For absolute vs. relative path, please see https://github.com/scipy/scipy/issues/16312. Neither is great, but as it stands absolute paths are preferred. I suggest keeping the discussion on that particular code construct on that SciPy issue.
I'm not sure I completely understand what this issue is proposing, however, I think that there must be a clear distinction between code generation tools and libraries. For code generation tools, `find_program('tool')` needs to work.
Okay thanks, I think this sounds good. You are both saying the same thing here - all code generation tools should be using `find_program`, so there's no need to get explicitly at the native interpreter within `meson.build` files.
We still have to choose between:
py = import('python').find_installation('python', modules : ['numpy'])
and
numpy_dep = dependency('numpy')
npymath_lib = dependency('numpy', modules: 'npymath')
npyrandom_lib = dependency('numpy', modules: 'npyrandom')
If the former is preferred, that still leaves us with how to get at the include dirs without running the non-native interpreter. Maybe this will work:
py = import('python').find_installation('python', modules : ['numpy'])
py_dep = py.dependency()
numpy_incdir = py.get_path('platlib') / 'numpy' / 'core' / 'include'
but that's still very constraining, and hardcodes paths that the user ideally wouldn't know about. I'm not sure if that can be improved upon. `dependency('numpy')` is a lot more flexible - I just had the implementation issue that the dependency provider for `numpy` should get at the detected Python install from `import('python').find_installation`.
Libraries are another matter. Unfortunately, AFAIK there isn't standardized a way for Python packages to expose this information.
Indeed.
What NumPy does is not optimal because it requires to execute Python code, which complicates cross compilation (in general the cross compilation story for Python packages is not great).
100% agreed, but I'll note that this issue aims to sidestep that requirement. I know that `numpy.get_include()` returns `path-to-numpy/core/include`, so I want to encode that knowledge into Meson itself so there is no longer a need to either run the interpreter or to hardcode the path in packages that depend on numpy.
One possible solution to investigate could be to add the required information to the wheel metadata (IIUC the wheel metadata can be extended beyond what is defined in the PEPs). Doing so would possibly open the door to reading the metadata fields without running target system code.
That'd be a nice enhancement in the future - but not required for this issue.
I don't think that my proposal for how to solve absolute vs relative include path passed to `include_directories()` vs using `declare_dependency()` is unrelated to the topic of this issue. You are proposing to have `dependency('numpy')` be a dependency with a custom lookup functionality. This can only return the same type of object that `declare_dependency()` gives you. If `dependency('numpy')` is a solution, `declare_dependency(include_directories: numpy_incdir)` is at least going in the right direction.
I just had the implementation issue that the dependency provider for `numpy` should get at the detected Python install from `import('python').find_installation`.
I don't think this is going to fly as there can be more than one Python installation obtained from the Meson's python module (ie a python2 and a python3 installations). The only other interface I can think about that may solve the issue is:
py = import('python').find_installation()
python_with_numpy_dep = py.dependency(modules: 'numpy')
where a `modules` argument is added to the `python_installation` object's `dependency()` method. When the `modules` argument is specified, the `dependency` object returned is augmented with the information relative to the specified modules. This could also allow:
python_with_numpy_and_libnpymath = py.dependency(modules: 'numpy.core.npymath')
or something similar.
However, this seems like a lot of work (and a lot of not really immediate interfaces to document and remember) just to spare users from having to spell
py.get_path('platlib') / 'numpy' / 'core' / 'include'
If `dependency('numpy')` is a solution, `declare_dependency(include_directories: numpy_incdir)` is at least going in the right direction.
I will give this another try. I think I tried all permutations at the time, but maybe it will work with relative paths. I'll note that the `chdir` is to avoid having scipy's `signal` and `io` submodules shadowing the stdlib modules of the same name.
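The shadowing effect can be reproduced with a small Python sketch. With `python -c`, the current working directory is searched first for imports, so a local module named like a stdlib module wins unless the process changes directory before importing. The example uses `colorsys` as a stand-in for `signal`, on the assumption that it is not already loaded at interpreter startup; `imported_module_file` is a hypothetical helper.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def imported_module_file(module: str, cwd: Path, chdir_first: bool) -> str:
    """Run `python -c` in `cwd` and return the file `module` was loaded from."""
    chdir = 'os.chdir(".."); ' if chdir_first else ""
    code = f"import os; {chdir}import {module}; print(os.path.abspath({module}.__file__))"
    # PYTHONSAFEPATH (Python 3.11+) would keep the working directory off sys.path.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        cwd=cwd, env=env, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = Path(tmp) / "scipy"
        package_dir.mkdir()
        # Stand-in for scipy/signal/: a local module named like a stdlib one.
        (package_dir / "colorsys.py").write_text("# local module\n")
        print(imported_module_file("colorsys", package_dir, chdir_first=False))
        print(imported_module_file("colorsys", package_dir, chdir_first=True))
```

The first call reports the local file, the second reports the standard library's copy, which is the behaviour the `os.chdir("..")` in the one-liner relies on.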
However, this seems like a lot of work (and a lot of not really immediate interfaces to document and remember) just to spare users from having to spell
py.get_path('platlib') / 'numpy' / 'core' / 'include'
I'll note that there's a whole mess here that I didn't get into, with Python packages possibly being installed not under the regular site-packages (/ `platlib`) but in the user dir or another dir on `sys.path`. When running Python code inside Meson, one can handle all that and check that the returned `numpy/core/include` directory actually exists.
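A minimal Python sketch of that kind of check: given several candidate directories (for example the regular site-packages and the user site directory), return the first one whose `numpy/core/include` directory really contains NumPy headers. The function name `find_numpy_include` is hypothetical, and the `numpy/core/include` layout is the one discussed in this thread, which may differ between NumPy versions.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_numpy_include(search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first `<dir>/numpy/core/include` that contains NumPy headers.

    No Python code from NumPy is executed; only the file system is inspected.
    """
    for base in search_dirs:
        candidate = Path(base) / "numpy" / "core" / "include"
        # Require a known header so that an empty or stale directory is skipped.
        if (candidate / "numpy" / "arrayobject.h").is_file():
            return candidate
    return None
```

Callers decide the search order, so a user-site installation can be preferred over the system one (or the reverse) simply by ordering `search_dirs`.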
I will give this another try.
Done in https://github.com/scipy/scipy/pull/18006. It's not pretty, but it seems to work.
I can't say I follow all the inside baseball in this thread, but I thought I'd ask you experts what the current recommended approach to using NumPy's Array API from meson is.
Specifically, I'm working on using meson-python to build a Python wrapper for a C library. I want to use the Array API to percolate some stuff back up from C to Python, hence need to `#include <numpy/arrayobject.h>`.
My hope was that I'd be able to `numpy_dep = dependency('numpy')`, but it appears that something like this feature is still in the works.
@sampotter yes `dependency('numpy')` is still in the works. For now I recommend to do what SciPy does: https://github.com/scipy/scipy/blob/main/scipy/meson.build#L30-L73
It seems to me that #12799 fixed this with support for numpy-config
Good point, I forgot to link this issue. As commented on gh-12799, it'd be nice to also make things work for older NumPy versions though, and then close this issue.
This is an issue to keep track of a feature I'd like to have: making `dependency('numpy')` work, similar to the examples in https://mesonbuild.com/Dependencies.html#dependencies-with-custom-lookup-functionality.

First, here is my current code for NumPy support in my SciPy `meson` branch - it works, but isn't pretty:

Note that:
- `numpy.get_include()` is currently the only correct way to obtain the main NumPy include directory (there's no solid pkg-config support - although there are some remnants, they are untested/unused AFAIK)
- `numpy.f2py.get_include()` to do the same for F2PY include directories was added in numpy `1.21.1`. The code above uses relative directories because it needs to support older NumPy versions.
- `libnpymath.a` can be linked against, it's located at `numpy/core/lib/libnpymath.a`. It's quite old, can be assumed to always be present.
- `libnpyrandom` can be linked against, it's located at `numpy/random/lib/libnpyrandom.a`. It was introduced in numpy `1.19.0` (2020).

I think, based on reading the Meson docs, this is the desired support:

Note that `f2py`, `libnpymath` and `libnpyrandom` will always be present. I haven't used `modules:` before, so I'm not 100% sure that's the right thing here.

Cc @eli-schwartz, who offered to support/review this new feature.