olson-sean-k / wax

Opinionated and portable globs that can be matched against paths and directory trees.
https://glob.guide
MIT License

Case sensitivity is not considered in `Glob::partition`. #3

Closed (olson-sean-k closed 3 years ago)

olson-sean-k commented 3 years ago

I plan to expose flags for controlling case sensitivity in glob expressions. For example, `**/*.(?i){jpg,jpeg}` would match the extensions `jpg` or `jpeg` case-insensitively (a la typical regular expressions). While working on this, it became clear that case sensitivity must be considered in `Glob::partition`.

Unix file systems do not consider case or related character classes at all when resolving paths, but Windows file systems do (specifically via the Win32 API) and are effectively case insensitive. This means that glob matching may disagree with the path resolution performed by the target platform, and this (along with any future case sensitivity flags) must be reflected by `Glob::partition` and related APIs.

For example, the glob `foo/*.bar` is split into the path `foo` and the glob `*.bar` by `Glob::partitioned`. However, globs are currently case sensitive, so on Windows this may lead to inconsistent behavior between `Glob::new` with `Glob::is_match` and `Glob::partitioned` with `Glob::walk`:

```rust
use std::path::Path;
use wax::Glob;

let glob = Glob::new("foo/*.bar").unwrap();
// This is not a match, regardless of platform. Globs are case sensitive everywhere.
assert!(!glob.is_match("FOO/qux.bar"));

let (prefix, glob) = Glob::partitioned("foo/*.bar").unwrap();
// On Windows, this will descend into `./FOO`, which disagrees with `is_match` above.
for entry in glob.walk(Path::new(".").join(prefix), usize::MAX) { /*...*/ }
```

olson-sean-k commented 3 years ago

I believe this is fixed by the changes currently on the flag branch. :tada: Those changes introduce flags as described above and move state into the parser so that each literal token can read its case sensitivity from the latest flag state established during parsing. That information is used to determine the variance of literals when partitioning glob expressions.
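To illustrate the idea described above, here is a minimal, self-contained sketch of a parser that carries flag state and stamps it onto each literal token, followed by a partition step that only treats case-sensitive literals as invariant. This is not the code on the branch: the `Token` type and the `tokenize` and `partition` functions are hypothetical names, the `(?-i)` syntax for turning the flag back off is an assumption, and only `/`, `*`, and literals are handled.

```rust
/// Hypothetical token type: each literal remembers the case sensitivity
/// that was in effect when it was parsed.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Literal { text: String, case_insensitive: bool },
    Separator,
    Wildcard, // stands in for any non-literal pattern such as `*`
}

/// Pushes the pending literal (if any) using the current flag state.
fn flush(literal: &mut String, case_insensitive: bool, tokens: &mut Vec<Token>) {
    if !literal.is_empty() {
        tokens.push(Token::Literal {
            text: std::mem::take(literal),
            case_insensitive,
        });
    }
}

/// Toy tokenizer. `case_insensitive` is parser state that is updated by
/// `(?i)` and (by assumption) `(?-i)`; literals read the latest value.
fn tokenize(expr: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut case_insensitive = false;
    let mut literal = String::new();
    let mut rest = expr;
    while !rest.is_empty() {
        if let Some(tail) = rest.strip_prefix("(?i)") {
            flush(&mut literal, case_insensitive, &mut tokens);
            case_insensitive = true;
            rest = tail;
        } else if let Some(tail) = rest.strip_prefix("(?-i)") {
            flush(&mut literal, case_insensitive, &mut tokens);
            case_insensitive = false;
            rest = tail;
        } else {
            let c = rest.chars().next().unwrap();
            rest = &rest[c.len_utf8()..];
            match c {
                '/' => {
                    flush(&mut literal, case_insensitive, &mut tokens);
                    tokens.push(Token::Separator);
                }
                '*' => {
                    flush(&mut literal, case_insensitive, &mut tokens);
                    tokens.push(Token::Wildcard);
                }
                _ => literal.push(c),
            }
        }
    }
    flush(&mut literal, case_insensitive, &mut tokens);
    tokens
}

/// Splits tokens into an invariant path prefix and the remaining tokens.
/// A component is invariant only if it consists solely of case-sensitive
/// literals; a case-insensitive literal is treated as variant.
fn partition(tokens: &[Token]) -> (String, Vec<Token>) {
    let components: Vec<&[Token]> = tokens.split(|t| *t == Token::Separator).collect();
    let mut prefix = Vec::new();
    let mut consumed = 0;
    for component in &components {
        let invariant = !component.is_empty()
            && component
                .iter()
                .all(|t| matches!(t, Token::Literal { case_insensitive: false, .. }));
        if !invariant {
            break;
        }
        let text: String = component
            .iter()
            .map(|t| match t {
                Token::Literal { text, .. } => text.as_str(),
                _ => "",
            })
            .collect();
        prefix.push(text);
        consumed += 1;
    }
    // Re-join the remaining components with separators.
    let rest = components[consumed..].join(&Token::Separator);
    (prefix.join("/"), rest)
}
```

In this sketch, `partition(&tokenize("foo/*.bar"))` yields the prefix `foo`, whereas `(?i)foo/*.bar` yields an empty prefix, so `foo` stays in the glob and is matched case-insensitively rather than being handed to the file system as a literal path.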

As an aside, those changes add complexity to the parser, which must consider flags carefully. I should be sure to add even more testing around this...

olson-sean-k commented 3 years ago

Fixed in bc52751. :tada: