Closed squeek502 closed 2 years ago
After a bit of investigation, I think a reasonable change here might be to just add `.NOENT => return null,` to the `.linux` implementation of `Iterator.next`.
Some reasoning:
- `ENOENT` being returned seems to be Linux-specific; it is not a possible (or at least documented) return on FreeBSD, Mac, etc.
- I ran the test case under `wine` and got `FileBusy` from the `deleteTree` call, so it might not affect Windows in the same way, as it might be harder/impossible to delete the directory while iterating it (haven't tested on a real Windows machine yet, though).
- On FreeBSD, `getdents` returned `0` here, which `Iterator.next` treats as end-of-iteration and returns null (EDIT: for the sake of accuracy, it actually returns the entries for `.` and `..` [which get skipped] before returning `0`).
- I tried adding `.NOENT => return error.DirNotFound,` to the Linux switch and `DirNotFound` to the `IteratorError` error set just to see what it'd be like, and found that in all of the (few) cases in the standard library that used `Iterator.next`, it made sense to turn the `next` call into something like:

      iter.next() catch |err| switch (err) {
          error.DirNotFound => null,
          else => |e| return e,
      }

  which is exactly what returning null from within `Iterator.next` would do.
I think your reasoning for returning null makes sense and should go in a PR.
If we return a `DirNotFound` error, then the application can always turn that error into a `null`, but if we return `null`, any application that would care whether the directory was removed can't do the reverse.
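The asymmetry described here (a caller can fold an error into `null`, but cannot turn a `null` back into an error) can be sketched with a mock iterator. This is only an illustration: `MockIterator`, `Entry`, `countLenient`, and `countStrict` are hypothetical names, and the mock simulates deletion with a `delete_after` field instead of touching `std.fs`.

```zig
const std = @import("std");

// Hypothetical stand-ins for illustration only; this is not std.fs code.
const Entry = struct { name: []const u8 };

const MockIterator = struct {
    entries: []const Entry,
    index: usize = 0,
    // Simulates the directory being deleted once this many entries were returned.
    delete_after: ?usize = null,

    pub const Error = error{DirNotFound};

    pub fn next(self: *MockIterator) Error!?Entry {
        if (self.delete_after) |n| {
            if (self.index >= n) return error.DirNotFound;
        }
        if (self.index >= self.entries.len) return null;
        const entry = self.entries[self.index];
        self.index += 1;
        return entry;
    }
};

// A caller that does not care about deletion folds the error into end-of-iteration.
fn countLenient(iter: *MockIterator) usize {
    var count: usize = 0;
    while (true) {
        const maybe_entry = iter.next() catch |err| switch (err) {
            error.DirNotFound => null,
        };
        if (maybe_entry == null) break;
        count += 1;
    }
    return count;
}

// A caller that does care simply propagates the error.
fn countStrict(iter: *MockIterator) MockIterator.Error!usize {
    var count: usize = 0;
    while (try iter.next()) |_| count += 1;
    return count;
}
```

If `next` returned `null` on deletion instead, `countStrict` could not be written on top of it, which is the point being made.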
I can imagine applications that might care about this distinction. For example, say an application is reading config files from a directory. If that directory gets removed while it's iterating, rather than continuing on like it's finished with all the files, it may want to stop and assert an error. (I would want this behavior in my zigwin32 project that reads JSON files to generate the Zig bindings).
It would also be problematic for programs that do want to handle this case differently to detect it afterwards. This would mean you'd always have to check whether the directory still exists after you're done iterating over it (an extra check for the "happy path") and this check will always be a race condition since the directory could have been removed and recreated. The only correct solution they would have is to use the lower-level APIs to iterate over the directory.
Actually, now that I've thought about it, I would think that in most cases you'd probably want to assert an error if a directory is removed while you're iterating over it. In Zig, for example, if you're iterating over the files in a cache directory to copy, then you wouldn't want to continue like nothing happened if the source directory was suddenly removed.
@marler8997 I would normally agree with you, and had similar thoughts initially, but AFAIK the APIs in `std.fs.Dir` are meant to behave similarly on different platforms and AFAIK, detecting this condition can only happen on Linux/WASI. On non-Linux UNIX platforms, `ENOENT` is not a possible error, and `getdents` just returns 0 if the directory being iterated is deleted (only tested on FreeBSD, other platforms might behave differently).
So, if you wrote some code that handles `DirNotFound` in some particular way, you'd only get that behavior on Linux, and on non-Linux platforms you'd just get a silent end-of-iteration.
I figured that making Linux behave like non-Linux was the better route than trying to get non-Linux to behave like Linux (which would involve something like an `exists` check after any `0` return of `getdents`).
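For reference, the "make non-Linux behave like Linux" alternative could look roughly like the sketch below. Everything in it is hypothetical: `FakeDir`, `readEntries`, `stillExists`, and `nextChecked` are made-up stand-ins for a `getdents`-style read and a separate existence query, not real `std.fs` or OS calls.

```zig
const std = @import("std");

// Hypothetical model of a platform where reading a deleted directory yields 0.
const FakeDir = struct {
    remaining: usize,
    exists: bool = true,

    // Stand-in for a getdents-style call: returns how many entries were read.
    fn readEntries(self: *FakeDir) usize {
        if (!self.exists or self.remaining == 0) return 0;
        self.remaining -= 1;
        return 1;
    }

    // Stand-in for a separate "does the directory still exist?" query.
    fn stillExists(self: *const FakeDir) bool {
        return self.exists;
    }
};

// Emulates the Linux behavior by checking existence after every 0 return.
fn nextChecked(dir: *FakeDir) error{DirNotFound}!?usize {
    const n = dir.readEntries();
    if (n != 0) return n;
    // In real code this check is racy: the directory could be deleted and
    // recreated between the read above and the existence query here.
    if (!dir.stillExists()) return error.DirNotFound;
    return null;
}
```

The extra query on every end-of-iteration, plus the race noted in the comment, is the cost this route would carry.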
Some systems will have extra error codes that others do not. When this is the case, I think it's better to return the error when possible rather than ignore it by default. I believe this is what the rest of the std library does and I think it's the right choice in general.
Maybe you missed it, but I addressed the solution you suggested where you check whether the directory still exists afterwards. That solution has a race condition, since the directory can be deleted and re-created. After I had thought on it, it seemed like most programs (or at least many) would want to handle the case where the directory is deleted as a non-happy path. With this API, all cases that want the "correct behavior" are forced to duplicate their directory traversal code to use the lower-level API for any systems that can detect this error and fall back to the higher-level API otherwise.
I agree with marler: returning `null` is a potentially misleading result, and an error is more appropriate here.
I can see pros/cons to both approaches, and this general topic seems to be something that remains unresolved in terms of how the `std.fs` API should be designed. Some more relevant issues/discussions on the same theme:
I personally still lean towards returning null in this instance, though:
- Returning `DirNotFound` would mean you couldn't simply use `try` with `Iterator.next`. To be consistent across platforms you'd basically always have to catch/handle `DirNotFound`.
- Handling `DirNotFound` without converting it to null would only actually function on Linux, which (imo) reduces the usefulness of the error.

One potential compromise might be to split `next` into `next` and `nextLinux`, as is done for the `.macos, .ios, .freebsd, ...` switch case of `Iterator`, where `nextLinux` would return `DirNotFound`, and then `next` converts it to null. That way, if you're on Linux, you could call `nextLinux` directly if you want to handle the error, but otherwise you could just call `next` to get consistent cross-platform behavior.
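A minimal sketch of that split, assuming a mock in place of the real iterator: `SplitIterator`, `Entry`, and the `deleted` flag are hypothetical, and only the `next`/`nextLinux` naming comes from the proposal above. The real `next` would still propagate other errors; the mock has none, so its `next` returns a plain optional.

```zig
const std = @import("std");

// Hypothetical stand-ins for illustration only; this is not std.fs code.
const Entry = struct { name: []const u8 };

const SplitIterator = struct {
    entries: []const Entry,
    index: usize = 0,
    // Simulates the directory having been deleted out from under the iterator.
    deleted: bool = false,

    pub const LinuxError = error{DirNotFound};

    // Platform-specific variant: surfaces the deletion to callers that care.
    pub fn nextLinux(self: *SplitIterator) LinuxError!?Entry {
        if (self.deleted) return error.DirNotFound;
        if (self.index >= self.entries.len) return null;
        const entry = self.entries[self.index];
        self.index += 1;
        return entry;
    }

    // Portable variant: deletion looks like an ordinary end-of-iteration,
    // matching what the thread reports for non-Linux platforms.
    pub fn next(self: *SplitIterator) ?Entry {
        return self.nextLinux() catch |err| switch (err) {
            error.DirNotFound => null,
        };
    }
};
```

Code that wants the Linux-only signal calls `nextLinux` explicitly, so the platform dependence is visible at the call site rather than hidden in an error that never fires elsewhere.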
Zig Version
master
Steps to Reproduce
`getdents` can return `ENOENT` if the directory referred to by the fd is deleted before the next `getdents` call (but after the fd is opened). Currently this causes an unexpected error to be returned from `IterableDir.Iterator.next`.
Relevant code for the Linux implementation of `Iterator.next`: https://github.com/ziglang/zig/blob/a2ab9e36faded9755ecc1fe809c49140120c3c61/lib/std/fs.zig#L604-L612
Contrived test case (note that I've run into this in a non-contrived use case, though):
Expected Behavior
`ENOENT` to be handled in some way by `Iterator.next` (unsure in what way exactly, but it shouldn't be unreachable or an unexpected error).
Note: I've only tested this on Linux, unsure how this manifests on other platforms.
Actual Behavior