ninja-build / ninja

a small build system with a focus on speed
https://ninja-build.org/
Apache License 2.0

Ninja fails non-deterministically if `/dev/null` is not available #1750

Open cheshirekow opened 4 years ago

cheshirekow commented 4 years ago

This took me a long time to debug. In src/subprocess-posix.cc, ninja opens /dev/null as the stdin of every subprocess it spawns. I'm not sure what the expected behavior is, but if /dev/null is not available, posix_spawn returns ENOENT on my system (x86 Ubuntu 18.04), and apparently not on every call. As a result, ninja aborts with a single unhelpful line of output:

Fatal: posix_spawn: No such file or directory
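For anyone trying to reproduce the error outside ninja, here is a minimal sketch of the same pattern: a `posix_spawn_file_actions_addopen` on fd 0 followed by `posix_spawn`. The helper name `SpawnWithStdinFrom` and the `true` command are made up for illustration and are not ninja code. On glibc the failed open is typically reported through the return value of `posix_spawn`; POSIX also allows an implementation to report it as a child exit status of 127, so the sketch treats both as failure. It does not attempt to explain the non-determinism.

```cpp
#include <cassert>
#include <cstring>
#include <string>

#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char** environ;

// Hypothetical reproduction helper (not ninja code): runs `/bin/sh -c true`
// with fd 0 opened read-only from `stdin_path`. Returns an empty string on
// success, otherwise a description of the step that failed.
std::string SpawnWithStdinFrom(const char* stdin_path) {
  assert(stdin_path != nullptr);

  posix_spawn_file_actions_t action;
  int err = posix_spawn_file_actions_init(&action);
  if (err != 0)
    return std::string("posix_spawn_file_actions_init: ") + strerror(err);

  std::string error;
  pid_t pid = -1;
  err = posix_spawn_file_actions_addopen(&action, 0, stdin_path, O_RDONLY, 0);
  if (err != 0) {
    error = std::string("posix_spawn_file_actions_addopen: ") + strerror(err);
  } else {
    const char* argv[] = {"/bin/sh", "-c", "true", nullptr};
    err = posix_spawn(&pid, "/bin/sh", &action, nullptr,
                      const_cast<char* const*>(argv), environ);
    if (err != 0)
      error = std::string("posix_spawn: ") + strerror(err);
  }
  posix_spawn_file_actions_destroy(&action);
  if (!error.empty())
    return error;

  // The spawn call itself succeeded; the child may still have failed while
  // applying the file actions, which POSIX reports as exit status 127.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return std::string("waitpid: ") + strerror(errno);
  if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
    return std::string();
  return "child did not exit cleanly (127 means a file action or exec failed)";
}
```

Calling `SpawnWithStdinFrom("/dev/null")` inside a chroot that lacks `/dev/null` should return a non-empty message; the exact wording depends on the C library.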

I've been doing CI builds in a chroot environment and did not have /dev/null bound into the chroot. This worked for several weeks, and then randomly stopped working a few days after rebuilding the chroot filesystem. Weirdly, ninja was able to build everything under the all target, and would randomly fail during jobs under the lint target. I'm not sure what really changed, maybe a glibc update?

I'm not sure what the best thing to do is here, but one suggestion is to check whether /dev/null exists before calling posix_spawn, and to error out with a more helpful message if it does not.
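A rough sketch of what such a pre-flight check could look like follows. `CheckReadable` is a hypothetical helper, not an existing ninja function, and the message text is only a suggestion. The check is advisory: the file could still disappear between the check and the spawn, so the posix_spawn error path would have to stay in place.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of ninja): returns an empty string if `path`
// can be opened read-only, otherwise a message that names the path and hints
// at the likely cause.
std::string CheckReadable(const char* path) {
  assert(path != nullptr);

  int fd = open(path, O_RDONLY);
  if (fd < 0) {
    int saved_errno = errno;  // Preserve errno before any further calls.
    return std::string("cannot open ") + path + " for subprocess stdin: " +
           strerror(saved_errno) +
           " (is it missing from the chroot or container?)";
  }
  close(fd);
  return std::string();
}
```

A caller could run `CheckReadable("/dev/null")` once at startup and pass any non-empty result to its fatal-error routine, so the failure names the missing file instead of only `posix_spawn`.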

I understand that the lack of determinism isn't really a "ninja problem", but since this is apparently the reality it might be more friendly to someone in the future to avoid a hard-to-debug failure scenario.

jonesmz commented 4 years ago

Can you try modifying https://github.com/ninja-build/ninja/blob/master/src/subprocess-posix.cc#L87 so that, instead of calling

posix_spawn_file_actions_addopen(&action, 0, "/dev/null", O_RDONLY, 0);

the code calls

posix_spawn_file_actions_addclose(&action, 0);

and then see whether you can still reproduce the failure?

I have to admit, I don't know if that will work correctly, as I didn't research the situation thoroughly, but it seems silly to me to have the stdin file descriptor pointing to /dev/null instead of just not being set to anything.

https://linux.die.net/man/3/posix_spawn_file_actions_addopen
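To try the `addclose` variant in isolation, without patching ninja, something like the following sketch could be used. `RunWithStdinClosed` is a made-up name, and the surrounding setup is simplified compared with what subprocess-posix.cc does (no pipes, no signal mask, no process group handling).

```cpp
#include <cassert>

#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char** environ;

// Hypothetical experiment helper (not ninja code): runs `/bin/sh -c command`
// with fd 0 closed in the child instead of redirected from /dev/null.
// Returns the child's exit status, or -1 if spawning or waiting failed.
int RunWithStdinClosed(const char* command) {
  assert(command != nullptr);

  posix_spawn_file_actions_t action;
  if (posix_spawn_file_actions_init(&action) != 0)
    return -1;

  pid_t pid = -1;
  // Close fd 0 in the child; no path lookup happens, so a missing /dev/null
  // cannot make this step fail.
  int err = posix_spawn_file_actions_addclose(&action, 0);
  if (err == 0) {
    const char* argv[] = {"/bin/sh", "-c", command, nullptr};
    err = posix_spawn(&pid, "/bin/sh", &action, nullptr,
                      const_cast<char* const*>(argv), environ);
  }
  posix_spawn_file_actions_destroy(&action);
  if (err != 0)
    return -1;

  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

One caveat with this approach: when fd 0 is closed, the next file the child opens receives descriptor 0, and a program that later reads stdin may then read from that unrelated file. That is a common reason for redirecting stdin from /dev/null rather than closing it, so this is best treated as a debugging experiment and not as a drop-in fix.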

cheshirekow commented 4 years ago

I will try to find time to test this and report back.

jonesmz commented 2 years ago

Were you ever able to experiment with this?

cheshirekow commented 2 years ago

Oh, I think I did try this, but I do not recall the outcome. Sorry, it has been too long and I no longer work on that build system.

Since no one else has come looking, it seems a pretty esoteric thing to hit. You can probably safely close this issue.