BurntSushi / walkdir

Rust library for walking directories recursively.
The Unlicense
1.21k stars 106 forks source link

Panicked at 'called `Option::unwrap()` on a `None` value in IntoIter::push #118

Closed jonasbb closed 4 years ago

jonasbb commented 5 years ago

I was testing cargo sweep a bit and encountered this error. Since cargo sweeps seems to simply use the iterator I open this issue here. The relevant entries in the stack frame are 10-12.

I see this problem with v2.2.6 and v2.2.7. I did not test other versions.

$ env RUST_BACKTRACE=1 cargo sweep -d -r -v -i .
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:345:21
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:70
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:58
             at src/libstd/panicking.rs:200
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:215
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:478
   5: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:385
   6: rust_begin_unwind
             at src/libstd/panicking.rs:312
   7: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
   8: core::panicking::panic
             at src/libcore/panicking.rs:49
   9: <core::option::Option<T>>::unwrap
             at /rustc/f66e4697ae286985ddefc53c3a047614568458bb/src/libcore/macros.rs:11
  10: walkdir::IntoIter::push
             at /home/jbushart/.cargo/registry/src/github.com-1ecc6299db9ec823/walkdir-2.2.7/src/lib.rs:891
  11: walkdir::IntoIter::handle_entry
             at /home/jbushart/.cargo/registry/src/github.com-1ecc6299db9ec823/walkdir-2.2.7/src/lib.rs:847
  12: <walkdir::IntoIter as core::iter::traits::iterator::Iterator>::next
             at /home/jbushart/.cargo/registry/src/github.com-1ecc6299db9ec823/walkdir-2.2.7/src/lib.rs:712
  13: cargo_sweep::find_cargo_projects
             at src/main.rs:83
  14: cargo_sweep::main
             at src/main.rs:223
  15: std::rt::lang_start::{{closure}}
             at /rustc/f66e4697ae286985ddefc53c3a047614568458bb/src/libstd/rt.rs:64
  16: std::panicking::try::do_call
             at src/libstd/rt.rs:49
             at src/libstd/panicking.rs:297
  17: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:87
  18: std::rt::lang_start_internal
             at src/libstd/panicking.rs:276
             at src/libstd/panic.rs:388
             at src/libstd/rt.rs:48
  19: std::rt::lang_start
             at /rustc/f66e4697ae286985ddefc53c3a047614568458bb/src/libstd/rt.rs:64
  20: main
  21: __libc_start_main
  22: _start
BurntSushi commented 5 years ago

I cannot reproduce this because I don't know what the input to your program is, sorry. Could you please provide more details on how to test this on the same input?

jonasbb commented 5 years ago

I can reliable reproduce the problem on my computer. I don't know how to reproduce it somewhere else, but I thought I should report the bug anyways.

I have a big projects folder at /home/jbushart/projects, where I keep all my git checkouts. It is a mix of all kinds of projects in all kinds of languages, so it is not really possible for me to simplify this right now.

cargo sweep is installed from crates.io. It does not seem to use something fancy during directory walking: https://github.com/holmgr/cargo-sweep/blob/master/src/main.rs#L81

BurntSushi commented 5 years ago

If you run it on a smaller sub-directory---perhaps one that is a checkout of an open project---is there still a problem?

BurntSushi commented 5 years ago

Closing this due to inactivity. I've tried running walkdir over large directories on my system to see if this happens, but no dice.

I'd be happy to look more closely into this with a smaller reproduction. Please also include additional environment details, such as OS and file system.

LukasKalbertodt commented 4 years ago

I think I found the problem.

To repeat: the panic happens in this statement:

https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L898-L900

It happens because self.oldest_opened is larger than self.stack_list.len(), which (according to a comment below) should never happen. But it does. I checked how this can happen and found two ways, one of which seems to be the underlying bug here:


In the method IntoIter::skip_current_dir (which is called by cargo sweep), the stack_list is popped but oldest_opened is not decreased. If we compare that to IntoIter::pop, we see that oldest_opened gets decreased if necessary.

By inserting this line:

self.oldest_opened = min(self.oldest_opened, self.stack_list.len());

... into IntoIter::skip_current_dir after line 781, the bug is fixed for me. (Note: I'm not sure, but maybe skip_current_dir should call self.pop() instead of popping itself?)


The second way how (I think) the oldest_opened can be higher than the stack length is in push() itself:

https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L896-L938

The value is increased in line 907 already, but the stack is only pushed later (line 936). In between the function might return early due to errors. If the user ignores the errors and continues to call next(), this should lead to the same problem. As a solution to this, I would probably postpone increasing oldest_opened until the end of the function, when the stack is actually pushed.

This does not happen for me, but if I understand it correctly it could be a problem in some situations.


@BurntSushi Unfortunately, I still haven't found a way to make this bug easily reproducible for everyone. It just happens with a lot of files and I haven't found a logic behind it. As I do not yet understand the bigger picture behind the stack_list and how it is filled, I cannot create an artificial example. But I hope I described my findings in enough detail so that this makes sense. If you still want to debug for yourself, I can still try to understand the code more to maybe find a reproducible example.

If my explanations seem correct to you, should I open a PR to fix this?

BurntSushi commented 4 years ago

Excellent, thank you. A thorough debugging analysis is fine in lieu of a reproduction, thanks! Although, it seems it should be possible to produce a regression test for this, at least for the first case you identified.

I'm AFK at the moment, but your suggested changes seem reasonable for a PR. I think the best way to fix this is to make sure the state is in sync, so if we can call pop from skip current dir, then that would be nice.

On Wed, Jul 10, 2019, 09:40 Lukas Kalbertodt notifications@github.com wrote:

I think I found the problem.

To repeat: the panic happens in this statement:

https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L898-L900

It happens because self.oldest_opened is larger than self.stack_list.len(), which (according to a comment below) should never happen. But it does. I checked how this can happen and found two ways, one of which seems to be the underlying bug here:

In the method IntoIter::skip_current_dir https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L779-L786 (which is called by cargo sweep), the stack_list is popped but oldest_opened is not decreased. If we compare that to IntoIter::pop https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L940-L949, we see that oldest_opened gets decreased if necessary.

By inserting this line:

self.oldest_opened = min(self.oldest_opened, self.stack_list.len());

... into IntoIter::skip_current_dir after line 781, the bug is fixed for me. (Note: I'm not sure, but maybe skip_current_dir should call self.pop() instead of popping itself?)

The second way how (I think) the oldest_opened can be higher than the stack length is in push() itself:

https://github.com/BurntSushi/walkdir/blob/f85554d45938bc2c97f348549369e9ceef22e578/src/lib.rs#L896-L938

The value is increased in line 907 already, but the stack is only pushed later (line 936). In between the function might return early due to errors. If the user ignores the errors and continues to call next(), this should lead to the same problem. As a solution to this, I would probably postpone increasing oldest_opened until the end of the function, when the stack is actually pushed.

This does not happen for me, but if I understand it correctly it could be a problem in some situations.

@BurntSushi https://github.com/BurntSushi Unfortunately, I still haven't found a way to make this bug easily reproducible for everyone. It just happens with a lot of files and I haven't found a logic behind it. As I do not yet understand the bigger picture behind the stack_list and how it is filled, I cannot create an artificial example. But I hope I described my findings in enough detail so that this makes sense. If you still want to debug for yourself, I can still try to understand the code more to maybe find a reproducible example.

If my explanations seem correct to you, should I open a PR to fix this?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/BurntSushi/walkdir/issues/118?email_source=notifications&email_token=AADPPYXDPM7WU5S76COG4G3P6XREHA5CNFSM4G4CGAAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZTPV5Y#issuecomment-510065399, or mute the thread https://github.com/notifications/unsubscribe-auth/AADPPYVSRLH4YCHGVDXTEU3P6XREHANCNFSM4G4CGAAA .