WebAssembly / wasi-libc

WASI libc implementation for WebAssembly
https://wasi.dev
Other
860 stars 201 forks source link

Trouble with the definition of O_RDWR #163

Closed corwin-of-amber closed 4 years ago

corwin-of-amber commented 4 years ago

It seems that GNU programs do not expect O_RDONLY, O_WRONLY, and O_RDWR to share bits. This is demonstrated by this excerpt from gnulib's openat.c:

  if (flags & (O_CREAT | O_WRONLY | O_RDWR))
    {
      size_t len = strlen (filename);
      if (len > 0 && filename[len - 1] == '/')
        {
          errno = EISDIR;
          return -1;
        }
    }

This code is meant to check for a condition where write access is required but the filename ends with a slash, indicating that it should be a directory. However, with wasi-libc, if a directory was opened with read access (flags == O_RDONLY), the value of flags & (O_CREAT | O_WRONLY | O_RDWR) is still nonzero since the definition of O_RDWR is: https://github.com/CraneStation/wasi-libc/blob/446cb3f1aa21f9b1a1eab372f82d65d19003e924/libc-bottom-half/headers/public/__header_fcntl.h#L37

sbc100 commented 4 years ago

Yup, looks like a bug to me. The fix to libc-bottom-half/headers/public/__header_fcntl.h should be fairly straight forward. Will probably involve modifying libc-bottom-half/cloudlibc/src/libc/fcntl/openat.c too.

sbc100 commented 4 years ago

Looks like we inherited this bug from cloudabi: https://github.com/NuxiNL/cloudlibc/blob/master/src/include/fcntl.h#L93

We should probably duplicate the bug and/or fix there too.

corwin-of-amber commented 4 years ago

Of course, any implementation that used to check for a write access flag via flags & O_WRONLY (vs. flags & (O_WRONLY | O_RDWR)) would be broken by this. But it would have been broken on any Unix as well.

sunfishcode commented 4 years ago

Strictly speaking, this is not a libc bug, since neither POSIX nor even GNU guarantee O_RDWR to be defined that way -- on GNU Hurd, O_RDWR is O_RDONLY|O_WRONLY. So from the description here, it's a bug in gnulib. I'm not immediately opposed to changing WASI libc here for bug compatibility, but I do want to understand the space better first.

WASI libc does have a working openat. Can you determine why gnulib thinks it needs to provide its own openat replacement? That will probably cause other problems, because WASI libc's open is defined in terms of openat, so if you override openat with a replacement that works in terms of open, that's a recipe for trouble ;-).

corwin-of-amber commented 4 years ago

It's a valid question and I was asking myself the same thing. I am hardly an expert on anything GNU but I've had a bit of experience compiling a few coreutils programs with wasi-libc.

GNU's openat in fact wraps the system's openat, performing a few more checks before and after the call. I am guessing this is in order to provide consistent error-handling and flag semantics across OSs; this particular check was surrounded with a config macro OPEN_TRAILING_SLASH_BUG, and a detailed comment explaining that on some systems, openat does not conform with POSIX, which require open and openat to fail with EISDIR in certain cases:

# if OPEN_TRAILING_SLASH_BUG
  /* Fail if one of O_CREAT, O_WRONLY, O_RDWR is specified and the filename
     ends in a slash, as POSIX says such a filename must name a directory
     <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13>:
       "A pathname that contains at least one non-<slash> character and that
        ends with one or more trailing <slash> characters shall not be resolved
        successfully unless the last pathname component before the trailing
        <slash> characters names an existing directory"
     If the named file already exists as a directory, then
       - if O_CREAT is specified, open() must fail because of the semantics
         of O_CREAT,
       - if O_WRONLY or O_RDWR is specified, open() must fail because POSIX
         <https://pubs.opengroup.org/onlinepubs/9699919799/functions/openat.html>
         says that it fails with errno = EISDIR in this case.
     If the named file does not exist or does not name a directory, then
       - if O_CREAT is specified, open() must fail since open() cannot create
         directories,
       - if O_WRONLY or O_RDWR is specified, open() must fail because the
         file does not contain a '.' directory.  */

and a similar one after the call to the original openat:

# if OPEN_TRAILING_SLASH_BUG
  /* If the filename ends in a slash and fd does not refer to a directory,
     then fail.
     Rationale: POSIX says such a filename must name a directory
     <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13>:
       "A pathname that contains at least one non-<slash> character and that
        ends with one or more trailing <slash> characters shall not be resolved
        successfully unless the last pathname component before the trailing
        <slash> characters names an existing directory"
     If the named file without the slash is not a directory, open() must fail
     with ENOTDIR.  */
corwin-of-amber commented 4 years ago

BTW gnulib provides wrappers for many other system calls. This is just the first with which I had trouble so far.

sunfishcode commented 4 years ago

Ah, ok. I had glanced at the gnulib code and misinterpreted what it was doing. So then my question is, is OPEN_TRAILING_SLASH_BUG defined in your build? If so, it seems like we should fix WASI and/or WASI libc to not have this bug :-).

corwin-of-amber commented 4 years ago

Definitely, for this particular instance, setting OPEN_TRAILING_SLASH_BUG to 0 solves the problem. This is pure coincidence though. The reason is was defined as 1 in the first place is not WASI libc's fault at all, but rather Apple's. 🍎 😄 Here's the thing: running configure directly with wasi-sdk is impractical, because configure typically compiles and runs C programs. There are just too many opportunities for things to go wrong even if you somehow set it up so that the resulting executables are launched with e.g. wasmtime. For this reason, I run configure with my native clang and then compile the configured sources with wasi-sdk. Occasionally, like in this case, I have to go and manually change a config macro. It's messy and error-prone, but it's the best I managed to do.

So, that aside, I am guessing for the sole purpose of making GNU's openat wrapper work, they should just fix the code. I cannot attest to seeing other places where the specific bitwise values of the flags mattered. I think it is customary in Unix that each flag is a bit, but you did mention Hurd where this is also not the case. But in particular having a flag called O_WRONLY should mean that if it's present, then the file should be write only. The combination O_RDONLY | O_WRONLY is self-contradicting.

sunfishcode commented 4 years ago

It turns out that it's customary that they don't get their own bits. At a quick survey, all platforms I can find other than Hurd and CloudABI define these bits like this:

#define O_RDONLY  00
#define O_WRONLY  01
#define O_RDWR    02

So that code in gnulib only works by being lucky that it's testing for writeability and not readability, which wouldn't work that way.

You're right about the "only". Hurd for its part defines O_READ and O_WRITE as aliases for O_RDONLY and O_WRONLY to address that. If we keep O_RDWR as it is, perhaps we should do that too. Other than this naming issue, having independent bits for reading and writing is conceptually cleaner.

I think one can make the argument that enough platforms use 0, 1, and 2 that it's better to be bug-compatible with programs that accidentally depend on that than it is to stand on principle here, particularly since these flags are just in libc, and not WASI itself.

I don't have a sense of which way to go yet. Thoughts?

corwin-of-amber commented 4 years ago

Ok, if Hurd defines it that way and Hurd is, in fact, part of the GNU family, then they definitely should fix their openat implementation. gnulib works on Hurd because Hurd apparently does not have the trailing slash bug (being POSIX-compliant). I'm not sure whether wasi-libc has it, have to check a few benchmark cases. But if it's fixed, then a better approach perhaps would be to be able to run configure properly so that this flag is set to 0 and users don't encounter this situation.

sunfishcode commented 4 years ago

I've now reported the bug to Gnulib: https://lists.gnu.org/archive/html/bug-gnulib/2020-03/msg00000.html.

sunfishcode commented 4 years ago

It's now fixed in gnulib: http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=1d86584739a712597a6fe3a62ec412238f0f13f8

(Though note that even before this fix, it took an unconventional configuration to observe the probl in WASI.)

As discussed above, code which assumes O_RDONLY and O_WRONLY are independent bits is already on thin ice for O_WRONLY and broken for O_RDONLY on most platforms

If we get more reports about this, we can reevaluate, but for now I think we can close this.