olson-sean-k / wax

Opinionated and portable globs that can be matched against paths and directory trees.
https://glob.guide
MIT License
112 stars 10 forks source link

Remove lifetime generics from `GlobError`? #24

Closed milesj closed 2 years ago

milesj commented 2 years ago

First off, thank you for this library. The glob situation in Rust is... not ideal, and this library is amazing.

With that being said, it would be nice if GlobError and all its children did not use lifetime generics. Wanting to return Result<..., GlobError> from my methods forces me down this rabbit hole of adding lifetimes to all my methods, to any necessary arguments, and in the end just overly complicates everything.

It then puts me in situations like this: glob

Which are sometimes really difficult to work around. Right now my only path forward is to use unwrap(), but that has its own problems.

olson-sean-k commented 2 years ago

First off, thank you for this library.

Thanks, I appreciate it. 😃

The easiest way to avoid propagating and reasoning about GlobError's lifetime parameter is to use the 'static lifetime and clone its data via into_owned:

use wax::Glob;

type GlobError = wax::GlobError<'static>; // For convenience.

// There's no need to forward or propagate a lifetime, because `GlobError` is 'static.
pub fn find(directory: impl AsRef<Path>) -> Result<Vec<PathBuf>, GlobError> {
    let glob = Glob::new(...).map_err(GlobError::into_owned)?; // Clone the data if there is an error.
    for entry in glob.walk(directory, ...) {
        let entry = entry?;
        ...
    }
}

A function like the one above must use into_owned, because the GlobError may refer to the glob expression text used in Glob::new, which is local to the function. If instead the function accepted a glob expression &'t str parameter, then the lifetime of the GlobError would be 't and it would not be strictly necessary to use into_owned.

I've designed Wax to avoid copying data when possible, but that also means that its APIs expose the complexity of borrowing as lifetime parameters. This is why many types have an associated into_owned function. The upshot is that in many cases there is no need to copy anything.

Does this work in your situation?

milesj commented 2 years ago

Awesome, thanks for the quick response.

Let me give this a shot and report back. I'm sure it will work.

milesj commented 2 years ago

Yup that worked, thanks a bunch: https://github.com/moonrepo/moon/pull/105

Another nice to have would be an is_glob fn, which would return true if the string is a glob pattern. src/** -> yes, foo -> no. Right now I'm doing this (pre-wax): https://github.com/moonrepo/moon/pull/105/files#diff-59f9b6fb165685b88f87aed332e7f4f396bb795b2ad3f302f7d7944a36aa7cf6R47

olson-sean-k commented 2 years ago

Great!

Another nice to have would be an is_glob fn

Wax supports a notion called variance, which is somewhat similar to trying to detect whether or not a string is glob-like. In the upcoming 0.5.0 release, you can do something like this:

use wax::{Glob, Pattern};

pub fn is_glob_like(text: &str) -> bool {
    Glob::new(text).ok().map_or(false, |glob| glob.variance().is_variant())
}

A glob is invariant if it is equivalent to a native path as used with platform file system APIs. If a glob is invariant, then its Variance will contain a PathBuf. One thing to note though is that, depending on the platform, some non-literal patterns are considered invariant, such as {same,same}, which is just same. These APIs have already landed on master if you want to check them out. Here's a quick truth table for that example function:

expression glob-like invariant path
foo no foo
foo/bar no foo/bar
foo* yes n/a
[f][o][o] ? foo
{foo,bar} yes n/a
{foo,foo} no foo
(?i)foo ? foo
<a/:1,> yes n/a
<a:3> no aaa

?: depends on the target platform

If you'd like to detect any use of glob syntax, correct or otherwise, you can use the is_meta_character function (and possibly the is_contextual_meta_character function) to detect any characters in a string used for pattern syntax. / is not considered a meta-character.

pub fn is_glob_like(text: &str) -> bool {
    text.chars().any(|x| wax::is_meta_character(x))
}
milesj commented 2 years ago

Love it, thanks for all the info.

olson-sean-k commented 2 years ago

I'm reopening this issue. I agree with the feedback regarding the lifetime parameters in Wax's error types in this thread on Reddit. TL;DR: lifetime parameters are a bad idea in error types.

I plan to remove these lifetime parameters (as originally suggested in this issue). Errors can trivially copy data, which only occurs in the unhappy path and is typically very small in size (no more than hundreds of bytes and most often less). Moreover, some common APIs in the ecosystem require 'static errors. The complexity of the current error API is unusual and unnecessary.

milesj commented 2 years ago

That feedback you received in that reddit thread was so high quality and in depth, I'm impressed.

olson-sean-k commented 2 years ago

The lifetime parameters in public errors have been removed in 813804b. These errors no longer have into_owned functions and client code need not do anything to move these errors beyond a local context (they are always 'static).

Absolucy commented 2 years ago

It'd be nice if a new version could be released, as I'm currently running into lifetime issues with a very simple use-case that is likely related to this:

    let unpack = args
        .unpack
        .as_deref()
        .map(Glob::new)
        .transpose()
        .wrap_err("failed to parse --unpack glob")?;
error[E0597]: `args.unpack` does not live long enough
  --> src\app\pack.rs:12:15
   |
12 |       let unpack = args
   |  __________________^
13 | |         .unpack
14 | |         .as_deref()
   | |                   ^
   | |                   |
   | |___________________borrowed value does not live long enough
   |                     argument requires that `args.unpack` is borrowed for `'static`
...
19 |   }
   |   - `args.unpack` dropped here while still borrowed
olson-sean-k commented 2 years ago

I'd like to address a few more issues before the next release. Apologies for the delay.

Regarding the code that you've shared, I believe you can work around this in the meantime using into_owned as follows.

let unpack = args
    .unpack
    .as_deref()
    .map(|expr| Glob::new(expr).map_err(BuildError::into_owned))
    .transpose()
    .wrap_err("failed to parse --unpack glob")?;

Glob borrows the &str too, so if this code also attempts to move the Glob beyond the lifetime of args.unpack, then it will also need to take ownership via into_owned. This will likely continue to be the case in the next release. In 0.5.0, you can do something like the following.

let unpack = args
    .unpack
    .as_deref()
    .map(|expr| {
        Glob::new(expr)
            .map(Glob::into_owned) // `Glob` borrows `expr`; clone the data.
            .map_err(BuildError::into_owned) // This won't be needed in `0.6.0`.
    })
    .transpose()
    .wrap_err("failed to parse --unpack glob")?;