BurntSushi / walkdir

Rust library for walking directories recursively.
The Unlicense
1.22k stars 107 forks source link

Implement Clone for WalkDir #54

Closed KodrAus closed 6 years ago

KodrAus commented 7 years ago

Relevant API guideline

From the discussion: The traits to implement for WalkDir are:

Raised by @epage on reddit. The walkdir type should implement some common traits. They might not all make sense, but I think:

  • Debug
  • Clone
  • Eq was also mentioned. I'm not sure what the use-case for this one is though, maybe @epage can elaborate more.

Any other thoughts on this one?

epage commented 7 years ago

While Clone is more obvious, I'll still mention my use cases:

My interest in Eq is more speculative. cobalt is a static site generator. It has a live-update mode for testing out changes. At a later point, I'm planning to support partial updates rather than rebuilding the entire site. For some cases, I'll need to reload the configuration, see what has changed, and re-run the parts that have changed. I'm unsure if I'll be calling Eq on WalkDir or on a custom data structure for this purpose.

KodrAus commented 7 years ago

Thanks @epage. I'll add this to the tracking issue :+1:

tmccombs commented 7 years ago

With the current implementation, I don't think this is possible, because WalkDirOptions contains an Option<Box<FnMut(...) -> Ordering + 'static>> Maybe it would be possible if FnMut was weakened to Fn, but then you would run into https://github.com/rust-lang/rust/issues/24000

BurntSushi commented 7 years ago

I strongly think the walkdir iterator should not impl Eq.

Clone seems fine if it's doable. Can we put the closure into an Arc?

epage commented 7 years ago

I strongly think the walkdir iterator should not impl Eq.

To be clear, my request is for Eq to be implemented for WalkDir and not its iterator. If you still feel that way, I'm curious why. My interest isn't as much to push strongly on this one point (as I mentioned, my interest is speculative) but to (1) gain insight and (2) help with libz blitz decisions and (3) ensure its recorded so hopefully people don't bring this up again.

What is the intention behind sorter being an FnMut? I'm trying to better understand so I can see how Arc plays out with my suggested use cases.

BurntSushi commented 7 years ago

If you still feel that way, I'm curious why.

I guess I should re-word my thoughts:

  1. What are the specific use cases for comparing two WalkDir values for equality?
  2. What does equality actually mean? Is it just comparing the configuration? If so, then you really can't see through a closure for equality, so I'm not sure how that would work. Or does equality compare the set of directories and files yielded by the configuration? If the latter, are only the directory and file names compared, or are the contents considered as well? How are errors handled?

I think (1) is the most important. If we have (1), then perhaps that will yield answers to (2).

What is the intention behind sorter being an FnMut? I'm trying to better understand so I can see how Arc plays out with my suggested use cases.

Generally speaking, FnMut should probably be the default closure type you reach for. It's the least restrictive from the caller's perspective. It is also consistent with the sort_by method on slices inside std.

epage commented 7 years ago

One of the features of cobalt (static site generator) is that it can live update a preview of a site. Currently, when it detects any change, it completely rebuilds the site. Something we're interested in eventually implementing is the ability to do partial rebuilds. One way I'm considering doing this is to load the configuration and rebuild the site for any part of the configuration that is different (another mechanism would handle re-walking if files are deleted / added).

I'm not entirely sure yet if I'd be doing equality checks on WalkDir to see if the config has changed or a more application specific data type.

To put it another way for WalkDir equality to be on the config and not on the filesystem: I at least think of WalkDir as the builder and the iter as the built object. From that, equality would be on its role as a builder and not as some abstract representation of a filesystem. If someone wants that, they should probably use dir-diff.

If so, then you really can't see through a closure for equality, so I'm not sure how that would work.

I find this interesting feedback into the language. In some respects, closures are syntactic sugar for creating a callable struct. The captured values are effectively members of the struct. Unfortunately, there isn't a way to mark a closure as Clone or Eq. The only way to do things like that are for someone to hand-roll the callable struct rather than leveraging the syntactic sugar. I expect this to get worse as coroutines are added to the language.

C++'s way of handling this is the most ideal (allow a wrapper object to only expose a "trait" if an inner object exposes it) but that only works because of the lack of concepts/traits + SFINAE. It'd be hard to translate that over to Rust.

BurntSushi commented 7 years ago

Unfortunately, there isn't a way to mark a closure as Clone or Eq.

Even if you could, it would require enforcing that all captured values also satisfy Eq, which seems onerous.

I think I'd suggest building an application specific type for your use case.

laumann commented 6 years ago

I think we just established that even though WalkDir could implement Clone, it adds an undesirable overhead, because we want calls to be able to do mutation inside the closure (so it should remain FnMut and not Fn), and to achieve that it would be necessary to do something like Arc<Mutex<FnMut(...)>>.

So can this be closed then?