rust-lang / rust

[discussion] `ErrorKind::FilesystemLoop` from `io_error_more` #130188

Open GrigorenkoPV opened 3 weeks ago

GrigorenkoPV commented 3 weeks ago

@rustbot label C-discussion

Main tracking issue: #86442

Background

The io_error_more feature introduced 21 new variants into ErrorKind. They were FCP'd back in December 2022, but there appeared to be quite a lot of disagreement about 4 of the added variants, so the stabilization (#106375) got stalled for over twenty months. Thankfully, the 17 uncontroversial variants got stabilized in #128316, so now we just need to iron out a satisfactory design for the remaining 4 variants, and then they can be stabilized too.

In order to not block any of the remaining variants on each other and to not intertwine the discussions, I've created 4 separate issues, which summarize the concerns & suggestions voiced up until this point and can serve as a place for further discussion.

FilesystemLoop

Currently corresponds to ELOOP on Unix and ~nothing~ ERROR_CANT_RESOLVE_FILENAME on Windows. (https://github.com/rust-lang/rust/issues/86442#issuecomment-1235763183, #130207)

Current docs description:

Loop in the filesystem or IO subsystem; often, too many levels of symbolic links.

There was a loop (or excessively long chain) resolving a filesystem object or file IO object.

On Unix this is usually the result of a symbolic link loop; or, of exceeding the system-specific limit on the depth of symlink traversal.
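A minimal sketch of how the symlink-loop case described above can be reproduced and detected on stable Rust, where the `FilesystemLoop` variant cannot yet be named. It is Unix-only. The helper names (`is_loop_error`, `open_symlink_loop`) are hypothetical, and the numeric `ELOOP` values are hard-coded assumptions (40 on common Linux targets such as x86_64 and aarch64, 62 on macOS and the BSDs) so that the example does not depend on the `libc` crate.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

// Assumption: numeric ELOOP values, hard-coded instead of taken from `libc`.
#[cfg(target_os = "linux")]
const ELOOP: i32 = 40; // common Linux targets (x86_64, aarch64)
#[cfg(not(target_os = "linux"))]
const ELOOP: i32 = 62; // macOS and the BSDs

/// Hypothetical helper: true if `err` carries the OS code that std maps to
/// `ErrorKind::FilesystemLoop` on Unix.
fn is_loop_error(err: &io::Error) -> bool {
    err.raw_os_error() == Some(ELOOP)
}

/// Hypothetical helper: creates `a -> b` and `b -> a` inside `dir`,
/// then tries to open `a`, which forces the kernel to chase the loop.
fn open_symlink_loop(dir: &Path) -> io::Result<fs::File> {
    let a = dir.join("a");
    let b = dir.join("b");
    symlink(&b, &a)?; // a points at b (dangling for the moment)
    symlink(&a, &b)?; // b points back at a, closing the loop
    fs::File::open(&a)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("eloop-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let result = open_symlink_loop(&dir);
    fs::remove_dir_all(&dir)?; // removes the links themselves, does not follow them
    match result {
        Err(err) if is_loop_error(&err) => println!("loop detected: {err}"),
        other => println!("unexpected result: {other:?}"),
    }
    Ok(())
}
```

Comparing against `raw_os_error()` is the workaround available while the variant is unstable; it is platform-specific by construction, which is part of what a stable `ErrorKind` variant would abstract over.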

~Make it correspond to ERROR_CANT_RESOLVE_FILENAME on Windows~

Done in #130207

Old description

> for `ELOOP`, Windows appears to give `winapi::shared::winerror::ERROR_CANT_RESOLVE_FILENAME` in similar situations (e.g. symlink loops). Could we add that in, or perhaps generalise `FileSystemLoop` to the slightly more general case of being unable to resolve?

_Originally posted by Robert Collins in https://github.com/rust-lang/rust/issues/86442#issuecomment-1328334824_

In https://github.com/rust-lang/rust/issues/86442#issuecomment-1360188402 Ian Jackson voices a concern that this might not be the only place where `ERROR_CANT_RESOLVE_FILENAME` appears.

Chris Denton in https://github.com/rust-lang/rust/issues/86442#issuecomment-1360288630 and Robert Collins in https://github.com/rust-lang/rust/issues/86442#issuecomment-1367167550 confirm that this is the only place where Windows currently gives `ERROR_CANT_RESOLVE_FILENAME` and that there is a good correspondence with Unix's `ELOOP` (when it comes to symlinks; see below for the other usages of `ELOOP`).

Ian Jackson agrees with them in https://github.com/rust-lang/rust/pull/106375#issuecomment-1369656136, but proposes this should be done separately from stabilization.

There seems to be a consensus regarding this point.

Bikeshed the name: be about loops in general, drop "filesystem" from the name

Unix's ELOOP is not just for symlink loops (or too long symlink chains).

ELOOP itself isn't returned solely when loops are detected. Add to that list mount(2) returning ELOOP for move operations where the target is a child of the source - something that has absolutely nothing to do with symlinks, and execve returning ELOOP for exceeding recursion limits during recursive script execution (since Linux 3.8).

  • because OS errors are moving targets, we cannot assume Linux / BSD / others will not introduce a 5th or 6th meaning, and it's clear to me at least that Linux doesn't treat ELOOP as a filesystem error but a more general error.

I suggest renaming it to LoopError, but document that it means ELOOP on Linux and ERROR_CANT_RESOLVE_FILENAME on Windows, and either describe what we know right now, or provide breadcrumbs for readers to catch up.

Originally posted by Robert Collins in https://github.com/rust-lang/rust/issues/86442#issuecomment-1367167550

I have a mild preference for renaming FilesystemLoop to something that doesn't include Filesystem, for the same reason: OSes do use it for other errors. For instance, Linux also uses it for keyrings, BPF, network routing/filtering, vhost, and network bridges.

Originally posted by Josh Triplett in https://github.com/rust-lang/rust/issues/106375#issuecomment-1371870620

I disagree with renaming FilesystemLoop.

It is true that Unix has a tendency to reuse errno values, so that any particular errno value can often mean a variety of things. Particularly, less-common (even, obscure) APIs and facilities (ab)use errno values. Attempting to represent all these obscure possibilities leads to descriptions and categorisations that are vague and overlapping. We generally haven't done that and I don't think we should start now. (All of this was discussed at length in the earlier conversations in the tracking issue.)

The APIs available in std will produce this error for filesystem operations, not obscure other purposes. I think calling it FilesystemLoop is sensible.

Originally posted by Ian Jackson in https://github.com/rust-lang/rust/pull/106375#issuecomment-1372131054

Bikeshed the name: be about symlink resolution failure in general, stop mentioning loops

some system calls on Linux also use ELOOP to mean "ELOOP A loop exists in symbolic links encountered during resolution of the path argument, or O_NOFOLLOW was specified and the path argument names a symbolic link." so I think interpreting it as "symlink loop or similar symlink resolve error was encountered" might be an accurate description, although (bike-shedding!) I don't know if FilesystemLoop is an accurate name then, and not something like SymlinkResolutionFailed or such...

Originally posted by Alain Emilia Anna Zscheile in https://github.com/rust-lang/rust/issues/86442#issuecomment-1360459049
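The `O_NOFOLLOW` case quoted above can be sketched as follows: a single, perfectly resolvable symlink still produces `ELOOP` when it is opened with `O_NOFOLLOW`, even though no loop exists. This is a Linux-only illustration. The helper name `open_link_nofollow` is hypothetical, and the numeric constants are hard-coded assumptions for x86/x86_64 and arm/aarch64 Linux (other targets will not compile without adding their values); real code would normally take them from the `libc` crate.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::os::unix::fs::{symlink, OpenOptionsExt};
use std::path::Path;

// Assumptions: Linux-only numeric values, hard-coded instead of taken from `libc`.
const ELOOP: i32 = 40;
#[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
const O_NOFOLLOW: i32 = 0o400000;
#[cfg(any(target_arch = "aarch64", target_arch = "arm"))]
const O_NOFOLLOW: i32 = 0o100000;

/// Hypothetical helper: creates a regular file `dir/target` and a symlink
/// `dir/link` pointing at it, then opens the link with `O_NOFOLLOW`.
fn open_link_nofollow(dir: &Path) -> io::Result<File> {
    let target = dir.join("target");
    let link = dir.join("link");
    File::create(&target)?;
    symlink(&target, &link)?;
    OpenOptions::new()
        .read(true)
        .custom_flags(O_NOFOLLOW) // refuse to follow a symlink in the final path component
        .open(&link)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("nofollow-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let result = open_link_nofollow(&dir);
    fs::remove_dir_all(&dir)?;
    match result {
        Err(err) if err.raw_os_error() == Some(ELOOP) => {
            println!("ELOOP without any loop: {err}")
        }
        Err(err) => println!("different error: {err}"),
        Ok(_) => println!("open unexpectedly succeeded"),
    }
    Ok(())
}
```

Other Unix systems report this situation with different error codes, so the sketch says nothing about portability; it only shows why "loop" may be too narrow a description of `ELOOP` on Linux.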

Skgland commented 3 weeks ago

winapi::shared::winerror::Ian Jackson looks to be a copy-paste error?

GrigorenkoPV commented 3 weeks ago

Yeah, thanks, fixed it

ChrisDenton commented 3 weeks ago

Given the previous discussion, I have a mild preference for renaming this to LinkNotResolved (or something like that) as the error does seem more generic than just loops, even on Unix. Though I do get the argument for using Unix error names even if they're not entirely accurate.

Amanieu commented 2 weeks ago

We discussed this in the libs-api meeting. For now we're considering not stabilizing this until we get a concrete use case for why a user would want to match on FilesystemLoop (or whatever we rename it to) other than to print an error message (for which we already have a mechanism).

dtolnay commented 2 weeks ago

https://github.com/rust-lang/rust/issues/86442#issuecomment-1129269525 allegedly has a use case. I think it's this: https://github.com/rust-lang/rust/issues/86442#issuecomment-1129050835 "mocking APIs, testing error handling and various other things"

@lucacasonato

dtolnay commented 2 weeks ago

Would something like UnresolvableIndirection be appropriate? "Unresolvable" — it doesn't say that you are specifically dealing with a loop. Maybe the system hit some limit trying to determine whether there is a loop, or maybe there is definitely no loop but you used O_NOFOLLOW. "Indirection" — it doesn't say that you are specifically dealing with symlinks or a filesystem. Maybe it's a network route, or keyring. But it captures the gist of all the usages described above.

GrigorenkoPV commented 2 weeks ago

> #86442 (comment) allegedly has a use case. I think it's this: #86442 (comment) "mocking APIs, testing error handling and various other things"

From what I can tell, that guy's complaint was "it is possible to get this error from std, yet I cannot produce it myself", which would apply to any unstable ErrorKind and is not really a concrete use-case.
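For context, a rough sketch of what mocking this error looks like on stable Rust today: the variant cannot be named, but an equivalent `io::Error` can be fabricated from the raw OS code and fed to the code under test. The `Action` enum and `on_walk_error` handler are hypothetical, and the `ELOOP` values are hard-coded Unix assumptions (40 on common Linux targets, 62 on macOS and the BSDs).

```rust
use std::io;

// Assumption: numeric ELOOP values, hard-coded instead of taken from `libc`.
#[cfg(target_os = "linux")]
const ELOOP: i32 = 40; // common Linux targets (x86_64, aarch64)
#[cfg(not(target_os = "linux"))]
const ELOOP: i32 = 62; // macOS and the BSDs

/// Hypothetical decision a directory walker might take on an error.
#[derive(Debug, PartialEq)]
enum Action {
    SkipEntry,
    Abort,
}

/// Hypothetical handler under test: skip entries that fail with ELOOP,
/// abort on everything else.
fn on_walk_error(err: &io::Error) -> Action {
    match err.raw_os_error() {
        Some(code) if code == ELOOP => Action::SkipEntry,
        _ => Action::Abort,
    }
}

fn main() {
    // Fabricate the error without touching the filesystem.
    let mocked = io::Error::from_raw_os_error(ELOOP);
    assert_eq!(on_walk_error(&mocked), Action::SkipEntry);

    // An error built from a stable ErrorKind has no raw OS code.
    let other = io::Error::new(io::ErrorKind::NotFound, "missing");
    assert_eq!(on_walk_error(&other), Action::Abort);

    println!("mocked error: {mocked}");
}
```

This works only per platform and only through raw codes, which is the limitation the quoted complaint is about; whether that amounts to a concrete use case for stabilization is the open question in this thread.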