Closed lesteve closed 3 months ago
I think the lines:
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.
would be ok, if they were prefixed by some user friendlier message:
meson-python: importing sklearn installed in editable mode
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.
But I agree that the repeated message is a bit annoying when using parallelism that spawn Python workers that import sklearn
concurrently.
But I am also fine with what @lesteve suggests in this PR because it fixes both problems at once.
Thanks! I agree that this message can be confusing. I hadn't seen it printed multiple times yet, but that makes it worse. So looks like a good idea to suppress it.
It would be great if someone with the necessary rights could enable the Github actions workflow :pray:!
I can reproduce the Alpine failure in a Docker container and the bad news is that ninja -n
seems to segfault (ninja
alone works fine). The ninja version is 1.9 (released January 2019) and I have seen at least one test in meson-python that is skipped if the ninja version < 1.10, so maybe this could be an option?
My basic gdb skills only allowed me to get this information:
(gdb) r
Starting program: /usr/bin/ninja -C build/cp312 -n
warning: Error disabling address space randomization: Operation not permitted
ninja: entering directory 'build/cp312'
Program received signal SIGSEGV, Segmentation fault.
0x00007b65464a9458 in strlen (s=<optimized out>,
s@entry=0x8 <error: Cannot access memory at address 0x8>) at src/string/strlen.c:17
warning: 17 src/string/strlen.c: No such file or directory
Suggestions how to get more info more than welcome ...
I can reproduce the Alpine failure in a Docker container and the bad news is that
ninja -n
seems to segfault (ninja
alone works fine). The ninja version is 1.9 (released January 2019) and I have seen at least one test in meson-python that is skipped if the ninja version < 1.10, so maybe this could be an option?
This is:
https://github.com/michaelforney/samurai/issues/66 https://github.com/michaelforney/samurai/issues/81 https://github.com/michaelforney/samurai/commit/fb61f22c7e690715d309c41812412c4f432ef53a
Possibly something Alpine should backport...
I have skipped the test on Alpine for now based on detecting musllinux.
To know whether the test pass on Windows it would be nice that someone enable Github workflows. Full disclosure: locally the Windows tests don't pass and I don't really know why yet.
To know whether the test pass on Windows it would be nice that someone enable Github workflows.
Done. this can be a pain because the approve button has to be pressed after every push. If you want to submit a trivial PR to side-step this problem, I'm happy to merge that straight away (e.g, in the readme, change to use Meson
to using Meson
).
Thanks, the Windows tests fail in a similar way locally, the dry-run command fails with a exit code 1 and I can't find a reason in the captured stderr as to why, but maybe I am missing something ...
I'll be offline next week but I can try to pick this up when I come back. Any meson-python maintainer should not hesitate to push into my branch if that makes things easier.
If you want to submit a trivial PR to side-step this problem, I'm happy to merge that straight away (e.g, in the readme, change
to use Meson
tousing Meson
).
@rgommers I have opened https://github.com/mesonbuild/meson-python/pull/585 with this change.
@rgommers I have opened #585 with this change.
Sounds good - merged now. Should fix the issue with CI running here.
This is:
michaelforney/samurai#66 michaelforney/samurai#81 michaelforney/samurai@fb61f22
Possibly something Alpine should backport...
@eli-schwartz Are you implying that Alpine uses samurai but calls it ninja? I wonder if we can pull in a fixed package in our CI image...
Yes, it seems that installing ninja on Alpine Linux puts the executable in a location outside $PATH
and that the samuari package install a symlink that makes ninja actually be samurai. samurai seems to be installed with the build-base package, otherwise I don't understand why we get it in our CI images. Anyhow, installing ninja-is-really-ninja
should make the ninja
command actually be what is expected.
The bug @eli-schwartz mentions was fixed in samurai back in 2021, but samurai hasn't seen a release since 2020.
The CI failures for the development meson version are also main
and do not seem to be related to this PR.
Edit: the development meson builds seem to have been fixed. I don't really know what the freebsd issue is about but it does not seem related to this PR either, see build log
Failed to start an instance: INVALID_ARGUMENT: Not Found 404 Not Found POST https://compute.googleapis.com:443/compute/v1/projects/cirrus-ci-community/zones/us-central1-c/instances
{ "error": { "code": 404, "message": "The resource 'projects/freebsd-org-cloud-dev/global/images/family/freebsd-13-2' was not found", "errors": [ { "message": "The resource 'projects/freebsd-org-cloud-dev/global/images/family/freebsd-13-2' was not found", "domain": "global", "reason": "notFound" } ] } }
Please don't merge the main branch into your branch the next time: it makes rebasing the commits harder without reason.
I added the missing tests and fixed up the last issue and submitted as #594
Why the current behaviour is problematic
The "ninja: no work to do" output can be confusing as noted for example by @ogrisel in https://github.com/scikit-learn/scikit-learn/pull/28040#issuecomment-1886676007. @eli-schwartz mentioned that it could potentially be changed in meson-python https://github.com/scikit-learn/scikit-learn/pull/28040#issuecomment-1888470251.
Personally, when working on scikit-learn code, I find it reassuring to see some output if the sklearn import takes a while and get the feed-back that some compilation is happening. In the case when there is nothing to do I would rather have nothing printed. One thing I realised recently is that the "no work to do" output can be printed multiple times if you use subprocesses (makes complete sense since sklearn is imported in each subprocess). This can happen easily in scikit-learn for example with this snippet.
Output is:
Current implementation
The simplest thing I could find was to first execute the
ninja
command in dry-run mode and to check if "no work to do" is in the captured stdout. This may be a bit brittle in case the ninja output changes. Maybe there is a better way to check withninja
that there is no work to do?There does not seem to be any test right now for the editable verbose behavior, but I can try to add some if you think this is worth it.
Alternative implementation
If you think the overhead of having a
ninja
dry-run command is too much overhead, one possible alternative implementation would be to capturestdout
withp = subprocess.Popen(..., stdout=subprocess.PIPE)
and look at the first two lines of output. In my experience, this kind of manipulation are always more complicated than you would hope, but maybe I have been doing it wrong all this time ...The kind of complications I have in mind:
p.stdout.read
is blocking so in the case that there is no output for some reason, it will wait for ever. Maybe we know there will always be some output with theninja
command. You can also put you can usefcntl
to put stdout in non-blocking mode.