ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.68k stars 2.47k forks source link

`posix.fstatat` fails to propogate `SYMLINK_NOFOLLOW` when linking `wasi-libc` #20890

Open squeek502 opened 1 month ago

squeek502 commented 1 month ago

Zig Version

0.14.0-dev.653+91c17979f

Steps to Reproduce and Observed Behavior

const std = @import("std");

test "fstatat on a symlink with SYMLINK_NOFOLLOW" {
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();

    const testfile = try tmp.dir.createFile("testfile", .{});
    testfile.close();

    try tmp.dir.symLink("testfile", "testsymlink", .{});

    const stat = try std.posix.fstatat(tmp.dir.fd, "testsymlink", std.posix.AT.SYMLINK_NOFOLLOW);

    std.testing.expect(stat.mode & std.posix.S.IFLNK == std.posix.S.IFLNK) catch |err| {
        std.debug.print("stat.mode={X}\n", .{stat.mode});
        return err;
    };
}

This test passes normally and when targeting wasm32-wasi and not linking libc:

$ zig test stat-symlink-test.zig -target wasm32-wasi -femit-bin=stat-symlink-test.wasm --test-no-exec
$ wasmtime --dir=. stat-symlink-test.wasm
All 1 tests passed.

But fails when linking libc (meaning wasi-libc):

$ wasmtime --dir=. stat-symlink-test-libc.wasm
stat.mode=8000
1/1 stat-symlink-test.test.fstatat on a symlink with SYMLINK_NOFOLLOW...FAIL (TestUnexpectedResult)
Unable to dump stack trace: not implemented for Wasm
0 passed; 0 skipped; 1 failed.

(that mode is IFREG aka a file)

This is not a problem with the parameters being passed to fstatat: I confirmed that the AT_SYMLINK_NOFOLLOW is being passed to the fstatat_sym call here:

https://github.com/ziglang/zig/blob/059856acfc9f87d723a90af6a4214e128b8cae2e/lib/std/posix.zig#L4382

But strace shows that the NOFOLLOW flag is being dropped when going through wasmtime (the openat2 call should have O_NOFOLLOW, which it does when not linking wasi-libc):

openat2(13, "testsymlink", {flags=O_RDONLY|O_CLOEXEC|O_PATH, resolve=RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = 11
statx(11, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0664, stx_size=0, ...}) = 0
close(11)                               = 0

My first thought was that this is a wasi-libc/wasmtime bug, similar to https://github.com/ziglang/zig/issues/20747, but I have been unable to find a reproduction when building an intended-to-be-equivalent C version via wasi-sdk. Here's the program I'm trying (I'm using wasi-sdk 23.0 and wasmtime 23.0.1):

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main() {
    mkdirat(AT_FDCWD, "testdir", 0777);
    int dirfd = openat(AT_FDCWD, "testdir", O_RDONLY|O_DIRECTORY);

    int fd = openat(dirfd, "testfile", O_RDONLY|O_CREAT);
    close(fd);

    symlinkat("testfile", dirfd, "testsymlink");

    struct stat st;
    fstatat(dirfd, "testsymlink", &st, AT_SYMLINK_NOFOLLOW);

    printf("%X\n", st.st_mode);

    if ((st.st_mode & S_IFLNK) == 0) return 1;
    return 0;
}

Built with:

$ WASI_SDK=/home/ryan/Downloads/wasi-sdk-23.0-x86_64-linux
$ $WASI_SDK/bin/clang --sysroot=$WASI_SDK/share/wasi-sysroot stat-symlink.c -o stat-symlink-sdk.wasm

And run with:

$ wasmtime --dir=. stat-symlink-sdk.wasm
A000

(0xA000 is S_IFLNK aka symlink)

Expected Behavior

The test to pass when linking wasi-libc

rootbeer commented 1 month ago

I've got a change with a bunch of WASI with-or-without-libc fixes coming soon, I'll see if I can fold this in.

rootbeer commented 1 month ago

I think the problem is that AT_SYMLINK_NOFOLLOW should be 0x1 on wasi-with-libc, but its 0x100 in Zig? I'm not entirely sure where wasi-libc defines these constants, but I see an 0x1 in here: https://github.com/WebAssembly/wasi-libc/blob/230d4be6c54bec93181050f9e25c87150506bdd0/expected/wasm32-wasip1/predefined-macros.txt#L52 (I see the 0x100 over here, but I think that's a vanilla musl header?)

My test show the 0x1 works, which bolsters my case, but I'm curious to know what the actual source of truth for these constants is. ..

squeek502 commented 1 month ago

This might be relevant if Zig is calling into the "bottom half" of wasi-libc:

https://github.com/WebAssembly/wasi-libc/blob/main/libc-bottom-half/headers/public/__header_fcntl.h#L55 https://github.com/WebAssembly/wasi-libc/tree/main/libc-bottom-half

rootbeer commented 3 weeks ago

I believe I've found the header that defines the constants that wasi-libc expects its clients to use: lib/libc/include/wasm-wasi-musl/__header_fcntl.h. (This header is included by fcntl.h in the same directory and via a bunch of indirect pre-processing, overrides the default constants defined in fcntl.h, so someone who just glances at fcntl.h might see the wrong constants.)

The directory is included by wasm_libc.zig as the sysroot (the triple is wasm-wasi-musl):

        "-iwithsysroot",
        try comp.zig_lib_directory.join(arena, &[_][]const u8{ "libc", "include", triple }),

The "bottom-half" headers I linked to earlier, as I understand it, are for the wasi-libc implementation to talk to WASI APIs, so its not directly relevant here -- though it seems like many of the constants are defined to match across the two layers.

So, according to this header AT_SYMLINK_{FOLLOW,NOFOLLOW} should be 0x2 and 0x1 respectively.

I have fixes for this in #20991 (along with some testcases that exercise these values).

If you have a pure C compilation environment set up for WASI, you might trying printfing these various constants to see which values they map to?