ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

POSIX.1-2024 encourages returning `EILSEQ` for filenames with a newline #21883

Open squeek502 opened 3 weeks ago

squeek502 commented 3 weeks ago

POSIX.1-2024 has added EILSEQ as an error for most filesystem-related functions with the description:

> [EILSEQ] The last pathname component of path is not a portable filename, and cannot be created in the target directory.

and added a note in the RATIONALE section:

> Implementations are encouraged to have [the function] report an [EILSEQ] error if the last component of path contains any bytes that have the encoded value of a `<newline>` character.

(see mkdir for one example; a full list of affected functions is available here, and that article also talks more about this change here)

This is an application of "Austin Group Defect 251" (https://www.austingroupbugs.net/view.php?id=251) and is a break from the previous 'file paths are just arbitrary sequences of bytes (minus NUL)' stance.

This is relevant to Zig in a few ways:

  1. #15607 details how filesystem-related syscalls can already return EINVAL if the underlying filesystem doesn't support the filename (e.g. | on vfat), which Zig doesn't handle correctly yet
  2. #19005 made WASI handle ILSEQ by converting it to error.InvalidUtf8 because WASI defines filenames to be valid UTF-8 and returns ILSEQ when a filename is not valid UTF-8
    • Note: EILSEQ was previously not a documented possible error for the relevant POSIX APIs, so ILSEQ was assumed to be a WASI-specific return

With (2), this means that we will no longer be able to distinguish between "invalid UTF-8" and "contains newline" on WASI by the error code alone, so `.ILSEQ => return error.InvalidUtf8` will no longer be an advisable way to handle ILSEQ in WASI-specific code paths.
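To make the ambiguity concrete, here is a minimal sketch of what a caller would have to do to tell the two cases apart once ILSEQ can mean either one: inspect the path bytes after the fact. `IlseqCause` and `classifyIlseq` are hypothetical names for illustration only and are not part of `std`; the sketch also assumes the newline check only applies to the last path component, as in the POSIX wording quoted above.

```zig
const std = @import("std");

/// Hypothetical classification of why a syscall reported ILSEQ (not in std).
const IlseqCause = enum { invalid_utf8, contains_newline, unknown };

/// Best-effort guess based only on the path bytes; the error code alone
/// cannot carry this information.
fn classifyIlseq(path: []const u8) IlseqCause {
    // WASI case: filenames must be valid UTF-8.
    if (!std.unicode.utf8ValidateSlice(path)) return .invalid_utf8;

    // POSIX.1-2024 case: a newline byte in the last pathname component.
    const last = std.fs.path.basename(path);
    if (std.mem.indexOfScalar(u8, last, '\n') != null) return .contains_newline;

    // Some other filesystem- or platform-specific reason.
    return .unknown;
}
```

A guess like this can only ever be advisory, since the implementation is free to return ILSEQ for reasons the caller cannot see.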

My current thinking is that we might want to just lump all of these sorts of "bad filename" errors together and let the user attempt to sort it out, e.g. translate both INVAL and ILSEQ to something like error.BadPathName and then have a doc comment that lays out all the possible ways it can be returned on the various platforms.

[!NOTE] error.BadPathName was used as the example because it already exists and is used for this sort of purpose, e.g. there are some existing `.INVAL => return error.BadPathName` and `.OBJECT_NAME_INVALID => return error.BadPathName` mappings that partially address #15607 (examples: std.posix.openZ and std.os.windows.OpenFile)
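A rough sketch of the proposed mapping, for illustration only. `Errno`, `OpenError`, and `mapOpenErrno` are hypothetical stand-ins rather than the real `std.posix` types or functions, and the doc comment is just a guess at the shape such documentation might take:

```zig
const std = @import("std");

/// Hypothetical stand-in for the platform errno enum (not std.posix.E).
const Errno = enum { SUCCESS, INVAL, ILSEQ, NOENT };

const OpenError = error{
    /// The filename was rejected. Depending on the platform this can mean:
    /// - INVAL: the underlying filesystem does not support the name
    ///   (e.g. `|` on vfat)
    /// - ILSEQ on WASI: the name is not valid UTF-8
    /// - ILSEQ per POSIX.1-2024: the last path component contains a newline
    BadPathName,
    FileNotFound,
};

/// Collapse every "bad filename" errno into a single error and leave it
/// to the caller to work out which case applies.
fn mapOpenErrno(e: Errno) OpenError!void {
    switch (e) {
        .SUCCESS => return,
        .INVAL, .ILSEQ => return error.BadPathName,
        .NOENT => return error.FileNotFound,
    }
}
```

With this shape, WASI-specific code paths would no longer need their own `.ILSEQ` arm, at the cost of callers losing the more specific error.InvalidUtf8.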


This is almost a duplicate of https://github.com/ziglang/zig/issues/15607, but I thought it might be worth tracking separately so I'm making an issue for it.

alexrp commented 3 weeks ago

> My current thinking is that we might want to just lump all of these sorts of "bad filename" errors together and let the user attempt to sort it out, e.g. translate both INVAL and ILSEQ to something like error.BadPathName and then have a doc comment that lays out all the possible ways it can be returned on the various platforms

This seems reasonable to me.