BurntSushi / walkdir

Rust library for walking directories recursively.
The Unlicense
1.22k stars 107 forks source link

Endless loop when ReadDir returns an error #98

Closed codido closed 5 years ago

codido commented 6 years ago

The ReadDir documentation states that it may return an Err if there's some sort of intermittent IO error during iteration. However, it seems that there are some cases in which the error will be persistent, for instance, some directories in a /proc entry of a zombie process on Linux.

The entry wont be popped in this case, resulting in an endless loop:

$ cargo run --example walkdir /proc/2226/
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/examples/walkdir /proc/2226/`
/proc/2226/
/proc/2226/task
/proc/2226/task/2226
/proc/2226/task/2226/fd
ERROR: IO error for operation on /proc/2226/task/2226/fd: Permission denied (os error 13)
/proc/2226/task/2226/fdinfo
ERROR: IO error for operation on /proc/2226/task/2226/fdinfo: Permission denied (os error 13)
/proc/2226/task/2226/ns
ERROR: IO error for operation on /proc/2226/task/2226/ns: Permission denied (os error 13)
/proc/2226/task/2226/net
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
ERROR: Invalid argument (os error 22)
...
BurntSushi commented 6 years ago

Can you propose a fix? What do other directory walkers do? How does find behave?

codido commented 6 years ago

The old find implementation (preceding fts) actually experienced the same issue - it would continue calling readdir() on the same file descriptor and would result in an endless loop.

The new implementation seems to stop traversal at that point, as if end-of-directory was reached, but the entry is flagged with the appropriate error. This means that it wont deal with recoverable file system errors, but not sure if that could actually happen.

This could be mitigated by returning the error in DirList::next() as done today, but making sure the next iteration returns None. This can be combined with a few retries, but not sure that's actually needed.

By the way (and completely unrelated), while looking at this I noticed that ReadDir (in std) uses readdir for solaris & fuchsia but readdir_r on Linux. However, READDIR_R(3) seems to suggest that readdir should be used on Linux as well. Any idea about that?

BurntSushi commented 6 years ago

Interesting. I don't know about readdir_r/readdir. It may just be an oversight. Probably the easiest way to find out is to submit a PR that changes it. :-)

codido commented 6 years ago

Regarding readdir_r/readdir, seems like there's an open issue for that already (https://github.com/rust-lang/rust/issues/34668)

sharkdp commented 6 years ago

This is probably related to https://github.com/rust-lang/rust/issues/50619

BurntSushi commented 5 years ago

This bug should be fixed in the current beta release, which means it should be fixed in the Rust 1.29 stable release.

sanmai-NL commented 5 years ago

@codido: do you have the opportunity to test whether @BurntSushi is right in his last message? If so, can you please close the issue?

codido commented 5 years ago

@sanmai-NL Just gave it a quick shot and certainly seems to be resolved, thanks!