Closed dcbaker closed 2 years ago
This pull request introduces 1 alert when merging 33e06a7c4b27660db4d5b2a1bad7d9d6f5aaee56 into a493c950a73c5924e1e2f9454a5c13d0c7bc93bc - view on LGTM.com
new alerts:
For C++ I'm think using a -Dcython_language=(c|cpp)
might be best, then if you want to override it, you can do something like:
cpp_files = static_library(
'static_library
['foo.pyx'],
override_options : ['cython_language=cpp'],
)
python.extension_module(..., link_with : static_lib)
This pull request introduces 1 alert when merging a5bb3ebcd42032a4b4876a5da8b9f4f3fed1abd3 into a493c950a73c5924e1e2f9454a5c13d0c7bc93bc - view on LGTM.com
new alerts:
I've been testing this PR in https://github.com/rgommers/scipy/tree/meson, which starts porting SciPy submodules to Meson. So far everything works as expected.
I think we'll need to solve https://github.com/mesonbuild/meson/issues/8725 to completely make cython a 1st class language, but I'm glad this is step in the right direction.
How useful would this be in the main tree without the generated code bits? I.e. could that be dropped from this MR's scope and added afterwards?
I could fix the generated code bits to at least be on par with Rust pretty easily, ie you could have a purely generated or purely static build, but not a mix. I would be happy to land this at that point, and we can finish the structured input stuff separately. Let me update this today.
This is now based on #8735
So, the only things that don't work currently that probably should are: 1) we always output C not C++ 2) we always compile in python3 mode, with no way to get python2 mode.
This pull request introduces 1 alert when merging ef5327f3d51e478cd5710da739f7e8f59f7b779e into 4ca9a16288f51cce99624a2ef595d879acdc02d8 - view on LGTM.com
new alerts:
So, the only things that don't work currently that probably should are:
- we always output C not C++
This is something that the directive # distutils: language=c++
would solve, which the cythonize
frontend picks up. The cython
frontend is intentionally dumb and requires this option on the command line. (Although there is a bit of room for debate on this difference.)
2. we always compile in python3 mode, with no way to get python2 mode.
The can always be changed per-file with the language_level
directive. (If I understand correctly that that's what you are referring to here.)
I've added the an option for setting the python target (2 vs 3). I've called it cython_language
, but I'm opened to painting the bike some other color.
For C vs C++ I have something that works, but I have some annoying corners to sort out.
@rgommers, for y'all how important is C++ support? I.e, can we land support for cython without it and get back to that later?
@rgommers, for y'all how important is C++ support? I.e, can we land support for cython without it and get back to that later?
It's a small percentage of all files, so seems fine to land without it - we can still use custom_target
for the .pyx
files that target C++.
Here is the list of files containing "distutils: language=c++"
. Note that most are in .pxd
files, which we don't need special handling for (they simply get used when you run cython --cplus
on some .pyx
file).
Here is the list of all our Cython files:
files containing "distutils: language=c++". Note that most are in .pxd files
The distutils
directive isn't looked at in .pxd
files, simply because they are not handled by distutils
.
how important is C++ support? I.e, can we land support for cython without it and get back to that later?
Is there a reason not to use the cythonize
frontend? For the simple case of compiling one source file, it's just as good as the cython
frontend, but it considers the #distutils
directive. If you pass --force
, it will also ignore the file timestamps and always overwrite what's there.
Is there a reason not to use the
cythonize
frontend? For the simple case of compiling one source file, it's just as good as thecython
frontend,
Not sure I agree with that. cythonize
invokes setuptools
under the hood right? Invoking a second build system from within another build system is not a good idea, you're going to get different compile flags, issues with caching, and all sorts of hard to debug issues down the line.
cythonize
invokessetuptools
under the hood right?
Only if you ask it to build a complete extension module for you. If you just invoke it like cythonize module.pyx
, without -i
or -b
, then it will just generate the .c/.cpp file for it and be done.
If you just invoke it like
cythonize module.pyx
, without-i
or-b
, then it will just generate the .c/.cpp file for it and be done.
Ah, didn't realize that. It seems to be missing flags though, like --output-file
(this one is a hard necessity, because Meson doesn't allow in-tree builds) and --include-dir
. It's not a superset of cython
it seems, just a different thing altogether?
It seems to be missing flags though, like
--output-file
Oh, yeah, right. That sort-of goes against the idea of a "here's a bunch of files, do the right thing" kind of tool. Single file compilation is the normal case for cython
, but more of a special case for cythonize
. All of these settings go into generic options.
For the include path, there's -s include_path=…
, for the output directory -s output_dir=…
, for a single output file -s output_file=…
.
Yes, there's a bunch of good reasons to not use cythonzie and invoke cython directly:
1) we already have the infastructure in place 2) cythonize goes and does stuff behind our backs, that's not nice 3) we want to apply the CFLAGS or CPPFLAGS to the C or C++ files generated 4) we want to ccache the generated files if possible 5) we want to control the C or C++ compiler that's being used, including arguments passed (this is critical for cross compiling)
It does appear that I haven't wired up include directories though... I need to fix that.
Sorry. All of this sounds like FUD to me.
Yes, there's a bunch of good reasons to not use cythonzie and invoke cython directly:
1. we already have the infastructure in place
It's just a different command that you call. I don't see any infrastructure involved here.
2. cythonize goes and does stuff behind our backs, that's not nice
I don't know what you are referring to here. Maybe that it won't regenerate a target file if it knows that it doesn't have to (unless you --force it to)? Sounds more like a feature than a drawback to me. Or the fact that it knows whether the user wants C or C++ mode? Isn't that what you're after?
3. we want to apply the CFLAGS or CPPFLAGS to the C or C++ files generated
Those are environment variables. They get passed through if you ask cythonize to do the build for you. They don't matter to cythonize if you don't request that.
4. we want to ccache the generated files if possible
That's ok, cythonize doesn't care. Only the invoked C compiler does, whether called by cythonize or not.
5. we want to control the C or C++ compiler that's being used, including arguments passed (this is critical for cross compiling)
From what I understand, you do not want cythonize to invoke the C compiler for you. Then just don't instruct it to.
meson wants to be the coordinator for all compile stages. meson wants to generate ninja rules, just like any Makefile generator would, which:
using meson's choice of $CC which is not actually available as an environment variable because Meson Does Not Do Environment Variables™ (at build time at least, and reluctantly at configure time). In part because, well, ninja doesn't really support them. And wrapping commands via /usr/bin/env VAR=VAL cmd
and meson --internal basically-env-but-cross-platform.py VAR=VAL cmd
is complicated and introduces overhead. And passing them as cythonize --cc="the-chosen-cc" --cflags="the-chosen-cflags"
is not particularly greater and has already caused problems for e.g. gtkdoc and stuff which misparse command lines.
And none of this solves the problem where ninja should be able to report job progress and tell you when stuff is out of date without running the cythonize tool, which is important to meson because the point of meson using ninja is in part to guarantee no-op builds are instant, and running cythonize
to check is not instant and anyways is recorded as a running build job, not "everything is up to date".
If cythonize can be told to not build, only generate the .c files, then I see no problem. But it's not clear to me what it adds over the cython tool. If this is about detecting whether to generate .c or .cpp output files, then meson still needs to somehow know which one to generate output rules for, which is configure-time data. So this would need to either be:
As I said, I do have a branch that makes the C vs C++ selection work, but there's some corner cases I still haven't worked out.
Another thing to consider here, is that meson considers the target to be the lowest level of granularity, not the file. Meson does not, by design, allow you to pass specific arguments only to one file in a target, all files in the target are built with the same arguments. That means that if you have a target with two pyx files, they most both be compiled as C or both as C++, or you need to use an intermediate target. So if cythonize reads # distutils: language=c++
out of one file in a target and creates a .cpp file then the compilation will fail, as Meson expects a C file. Changing that design would a monumental task, and would probably not be accepted anyway.
Something I just ran into with this branch: when the cython
invocation fails, the ninja build silently continues and no Cython errors are shown. This makes it quite hard to debug.
An easy way to reproduce is to add a line:
from . cimport doesnt_exist
in a generated .pyx
file.
The actual code I was working on (from https://github.com/rgommers/scipy/blob/meson/scipy/linalg/meson.build):
# _decomp_update
_decomp_update_pyx = custom_target('_decomp_update',
build_by_default : true,
output : '_decomp_update.pyx',
input : '_decomp_update.pyx.in',
command : [py3, tempita, '@INPUT@', '-o', '@OUTDIR@'],
)
py3.extension_module('_decomp_update',
_decomp_update_pyx,
dependencies : py3_dep,
include_directories : inc_np,
install : true,
subdir : 'scipy/linalg')
Your project doesn't reproduce this issue since it fails immediately with
ninja: error: '_decomp_update.pyx', needed by 'scipy/linalg/_decomp_update.cpython-39-x86_64-linux-gnu.so.p/_decomp_update.pyx.c', missing and no known rule to make it
This is being caused by the rules:
build scipy/linalg/_decomp_update.pyx: CUSTOM_COMMAND ../scipy/linalg/_decomp_update.pyx.in | ../scipy/_build_utils/tempita.py /usr/bin/python3
COMMAND = /usr/bin/python3 ../scipy/_build_utils/tempita.py ../scipy/linalg/_decomp_update.pyx.in -o scipy/linalg
description = Generating$ _decomp_update$ with$ a$ custom$ command
build scipy/linalg/_decomp_update.cpython-39-x86_64-linux-gnu.so.p/_decomp_update.pyx.c: cython_COMPILER _decomp_update.pyx
ARGS = --fast-fail -3 -I/usr/lib/python3.9/site-packages/numpy/core/include -o scipy/linalg/_decomp_update.cpython-39-x86_64-linux-gnu.so.p/_decomp_update.pyx.c
Your project doesn't reproduce this issue
Argh, I was picking up an older version of this branch, sorry for the noise.
This is being caused by the rules:
Thanks. Am I doing something wrong then? It looks to me like _decomp_update.pyx.c: cython_COMPILER _decomp_update.pyx
in the rule you show above is incorrect. Since _decomp_update.pyx
is a generated file, I believe the rule should be:
_decomp_update.pyx.c: cython_COMPILER scipy/linalg/_decomp_update.pyx
As far as I can tell, I'm doing the same thing as the ct_ext
test case in cython/2 generated sources/meson.build
here (unless using @OUTDIR@
instead of @OUTPUT@
matters, but it shouldn't).
If I manually edit ninja.build
to update that rule, I do get the expected Cython failure:
[71/118] Compiling Cython source scipy/linalg/_decomp_update.pyx
FAILED: scipy/linalg/_decomp_update.cpython-39-x86_64-linux-gnu.so.p/_decomp_update.pyx.c
cython --fast-fail -3 -I/home/rgommers/anaconda3/envs/scipy-meson/lib/python3.9/site-packages/numpy/core/include -o scipy/linalg/_decomp_update.cpython-39-x86_64-linux-gnu.so.p/_decomp_update.pyx.c scipy/linalg/_decomp_update.pyx
Error compiling Cython file:
------------------------------------------------------------
...
# and object. In this file, only NULL is passed to these parameters.
cdef extern from *:
cnp.ndarray PyArray_CheckFromAny(object, void*, int, int, int, void*)
cnp.ndarray PyArray_FromArray(cnp.ndarray, void*, int)
from . cimport cython_blas as blas_pointers
Meson doesn't try to parse your command and always assumes the output file is in the current_build_dir() and the filename. It's very clear that this is a bug in the PR as it probably doesn't correctly handle generated outputs and the test case accidentally works because it doesn't use subdir()
That being said it would be easier to evaluate the bug you wanted to report rather than the bug you found along the way, if you had a minimal reproducer meson.build that didn't require editing build.ninja to view its effects.
Something I just ran into with this branch: when the cython invocation fails, the ninja build silently continues and no Cython errors are shown. This makes it quite hard to debug.
I'm confused, your later comment shows a cython error occurring. Is this not what you meant?
Thanks for the explanation @eli-schwartz.
I'm confused, your later comment shows a cython error occurring.
Yes, that's expected given the state of my branch - the code was using a cimport
of another module that I had not yet added support for. There's no bug there.
It's very clear that this is a bug in the PR
That's what I suspected, but wasn't sure. Once that bug is fixed in this PR, I'm unblocked.
Okay, thanks for clarifying! Was just trying to figure out if I should keep investigating your original suspicion that there were cases of failure-and-silently-continuing.
whether to generate .c or .cpp output files, then meson still needs to somehow know
That's reasonable. Seems like it doesn't matter which you choose then, and cython
wins by the simpler interface.
Regarding dependency collection, Cython does this in the Cython.Build.Dependencies
module, in DependencyTree.all_dependencies()
. Probably something that could be exposed in a simpler way.
I've updated the generated file support so it should handle generated targets correctly when they're in a subfolder. I get bit by that every time :(
@scoder one of the things that would make everything much better is to teach cython to output a make compatible dependency file. I looked at DependencyTree.all_dependencies()
but I'm not sure how to translate the information in the DependencyTree into something ninja can consume.
I've updated the generated file support so it should handle generated targets correctly when they're in a subfolder.
I gave this a quick try, this gives me:
File "[...]/site-packages/mesonbuild/backend/ninjabackend.py", line 1593, in generate_cython_transpile
ssrc = os.path.join(gen.subdir, ssrc)
AttributeError: 'CustomTargetIndex' object has no attribute 'subdir'
when running meson setup build
and using a custom_target
with multiple outputs, one of which is a .pyx
file, as documented at https://mesonbuild.com/Generating-sources.html#generating-multiple-files-at-a-time
If that doesn't ring any immediate bells, I can try to turn the offending code into a new test case tomorrow.
@rgommers fixed in the lastest push.
CustomTargetIndex doesn't have a subdir
attribute, but it does have a get_subdir
method.
That worked, thanks!
cython to output a make compatible dependency file
Yeah, there's even one of those really old feature requests for this: https://github.com/cython/cython/issues/1214
Looks like we never implemented this. Definitely useful, and shouldn't be difficult from the result of all_dependencies()
.
What's the current plan on getting this merged?
I need to get the CI passing, then it's ready from my point of view
@rgommers is this looking good for you?
@rgommers is this looking good for you?
Yes, I'm pretty happy with this. It works for all the simpler cases I'd expect it to work on. And the more complicated ones, like codegen of multiple files that depend on each other and other files in the tree, is the separate "structured inputs" PR gh-8775.
Awesome to have this merged. Thanks @dcbaker, all!
Okay, I stopped thinking about it and started writing code. This is not complete, there's a fair number of things not done (listed below). Currently, this is just enough to build one trivial test case, but I think it touches all the places that need to be changed, even if there is more work to do. I'm not sure how much more time I'll be able to spend on this, but here's a starting point if someone else wants to take it up.
TODO: