Closed corwin-of-amber closed 4 years ago
Yup, looks like a bug to me. The fix to libc-bottom-half/headers/public/__header_fcntl.h
should be fairly straight forward. Will probably involve modifying libc-bottom-half/cloudlibc/src/libc/fcntl/openat.c
too.
Looks like we inherited this bug from cloudabi: https://github.com/NuxiNL/cloudlibc/blob/master/src/include/fcntl.h#L93
We should probably duplicate the bug and/or fix there too.
Of course, any implementation that used to check for a write access flag via flags & O_WRONLY
(vs. flags & (O_WRONLY | O_RDWR)
) would be broken by this. But it would have been broken on any Unix as well.
Strictly speaking, this is not a libc bug, since neither POSIX nor even GNU guarantee O_RDWR
to be defined that way -- on GNU Hurd, O_RDWR
is O_RDONLY|O_WRONLY
. So from the description here, it's a bug in gnulib. I'm not immediately opposed to changing WASI libc here for bug compatibility, but I do want to understand the space better first.
WASI libc does have a working openat
. Can you determine why gnulib thinks it needs to provide its own openat
replacement? That will probably cause other problems, because WASI libc's open
is defined in terms of openat
, so if you override openat
with a replacement that works in terms of open
, that's a recipe for trouble ;-).
It's a valid question and I was asking myself the same thing. I am hardly an expert on anything GNU but I've had a bit of experience compiling a few coreutils programs with wasi-libc.
GNU's openat
in fact wraps the system's openat
, performing a few more checks before and after the call. I am guessing this is in order to provide consistent error-handling and flag semantics across OSs; this particular check was surrounded with a config macro OPEN_TRAILING_SLASH_BUG
, and a detailed comment explaining that on some systems, openat
does not conform with POSIX, which require open
and openat
to fail with EISDIR
in certain cases:
# if OPEN_TRAILING_SLASH_BUG
/* Fail if one of O_CREAT, O_WRONLY, O_RDWR is specified and the filename
ends in a slash, as POSIX says such a filename must name a directory
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13>:
"A pathname that contains at least one non-<slash> character and that
ends with one or more trailing <slash> characters shall not be resolved
successfully unless the last pathname component before the trailing
<slash> characters names an existing directory"
If the named file already exists as a directory, then
- if O_CREAT is specified, open() must fail because of the semantics
of O_CREAT,
- if O_WRONLY or O_RDWR is specified, open() must fail because POSIX
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/openat.html>
says that it fails with errno = EISDIR in this case.
If the named file does not exist or does not name a directory, then
- if O_CREAT is specified, open() must fail since open() cannot create
directories,
- if O_WRONLY or O_RDWR is specified, open() must fail because the
file does not contain a '.' directory. */
and a similar one after the call to the original openat
:
# if OPEN_TRAILING_SLASH_BUG
/* If the filename ends in a slash and fd does not refer to a directory,
then fail.
Rationale: POSIX says such a filename must name a directory
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13>:
"A pathname that contains at least one non-<slash> character and that
ends with one or more trailing <slash> characters shall not be resolved
successfully unless the last pathname component before the trailing
<slash> characters names an existing directory"
If the named file without the slash is not a directory, open() must fail
with ENOTDIR. */
BTW gnulib provides wrappers for many other system calls. This is just the first with which I had trouble so far.
Ah, ok. I had glanced at the gnulib code and misinterpreted what it was doing. So then my question is, is OPEN_TRAILING_SLASH_BUG
defined in your build? If so, it seems like we should fix WASI and/or WASI libc to not have this bug :-).
Definitely, for this particular instance, setting OPEN_TRAILING_SLASH_BUG
to 0 solves the problem. This is pure coincidence though. The reason is was defined as 1 in the first place is not WASI libc's fault at all, but rather Apple's. 🍎 😄 Here's the thing: running configure
directly with wasi-sdk is impractical, because configure
typically compiles and runs C programs. There are just too many opportunities for things to go wrong even if you somehow set it up so that the resulting executables are launched with e.g. wasmtime. For this reason, I run configure with my native clang
and then compile the configured sources with wasi-sdk. Occasionally, like in this case, I have to go and manually change a config macro. It's messy and error-prone, but it's the best I managed to do.
So, that aside, I am guessing for the sole purpose of making GNU's openat
wrapper work, they should just fix the code. I cannot attest to seeing other places where the specific bitwise values of the flags mattered. I think it is customary in Unix that each flag is a bit, but you did mention Hurd where this is also not the case. But in particular having a flag called O_WRONLY
should mean that if it's present, then the file should be write only. The combination O_RDONLY | O_WRONLY
is self-contradicting.
It turns out that it's customary that they don't get their own bits. At a quick survey, all platforms I can find other than Hurd and CloudABI define these bits like this:
#define O_RDONLY 00
#define O_WRONLY 01
#define O_RDWR 02
So that code in gnulib only works by being lucky that it's testing for writeability and not readability, which wouldn't work that way.
You're right about the "only". Hurd for its part defines O_READ
and O_WRITE
as aliases for O_RDONLY
and O_WRONLY
to address that. If we keep O_RDWR
as it is, perhaps we should do that too. Other than this naming issue, having independent bits for reading and writing is conceptually cleaner.
I think one can make the argument that enough platforms use 0, 1, and 2 that it's better to be bug-compatible with programs that accidentally depend on that than it is to stand on principle here, particularly since these flags are just in libc, and not WASI itself.
I don't have a sense of which way to go yet. Thoughts?
Ok, if Hurd defines it that way and Hurd is, in fact, part of the GNU family, then they definitely should fix their openat
implementation. gnulib works on Hurd because Hurd apparently does not have the trailing slash bug (being POSIX-compliant). I'm not sure whether wasi-libc has it, have to check a few benchmark cases. But if it's fixed, then a better approach perhaps would be to be able to run configure
properly so that this flag is set to 0 and users don't encounter this situation.
I've now reported the bug to Gnulib: https://lists.gnu.org/archive/html/bug-gnulib/2020-03/msg00000.html.
It's now fixed in gnulib: http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=1d86584739a712597a6fe3a62ec412238f0f13f8
(Though note that even before this fix, it took an unconventional configuration to observe the probl in WASI.)
As discussed above, code which assumes O_RDONLY and O_WRONLY are independent bits is already on thin ice for O_WRONLY and broken for O_RDONLY on most platforms
If we get more reports about this, we can reevaluate, but for now I think we can close this.
It seems that GNU programs do not expect
O_RDONLY
,O_WRONLY
, andO_RDWR
to share bits. This is demonstrated by this excerpt from gnulib'sopenat.c
:This code is meant to check for a condition where write access is required but the filename ends with a slash, indicating that it should be a directory. However, with wasi-libc, if a directory was opened with read access (
flags == O_RDONLY
), the value offlags & (O_CREAT | O_WRONLY | O_RDWR)
is still nonzero since the definition ofO_RDWR
is: https://github.com/CraneStation/wasi-libc/blob/446cb3f1aa21f9b1a1eab372f82d65d19003e924/libc-bottom-half/headers/public/__header_fcntl.h#L37