Closed travv0 closed 5 years ago
Actually, with this change "foo/**/.baz/asdf"
no longer matches "foo/bar/.baz/asdf"
. I'll look at it some more.
I'm pretty new at Haskell so I'm guessing this could be more efficient/easier to read, but basically the problem was that it was using init
to get the paths to check for dots, so if any of them were hidden besides the last file/directory, even if they were matched explicitly and not by the **
, it was returning False
. Please let me know what I can do to improve this.
Edit: To add to this, the code was originally only checking past the first directory if the glob ended in "**/", which was the cause of the original problem.
Thanks for the feedback! I'll take a look at this stuff soon.
On Tue, Dec 18, 2018 at 2:59 PM Matti Niemenmaa notifications@github.com wrote:
@Deewiant requested changes on this pull request.
Yeah, this was definitely not formulated as well as it could've been. (I was pretty new at Haskell when I wrote the whole library and haven't managed to take the time to clean things up at all.) Thanks for the PR and especially the test cases!
I think the approach itself is correct, just found some minor complaints.
In System/FilePath/Glob/Match.hs https://github.com/Deewiant/glob/pull/30#discussion_r242698211:
- in if null xs && not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) (init parts) && matches
- else matches +match' o (AnyDirectory:xs) path =
- if not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) matchedDirs && hasMatch
- else hasMatch
- where parts = pathParts (dropWhile isPathSeparator path)
- matches = map (match' o xs) parts
- hasMatch = or matches
- matchIndex = fromMaybe 0 (elemIndex True matches)
- (matchedDirs, _) = splitAt matchIndex parts
Since you're not using the other half of the split, this could be simplified using take.
In System/FilePath/Glob/Match.hs https://github.com/Deewiant/glob/pull/30#discussion_r242699009:
@@ -122,14 +124,17 @@ match' o (OpenRange lo hi :xs) path = match' o again@(AnyNonPathSeparator:xs) path@(c:cs) = match' o xs path || (not (isPathSeparator c) && match' o again cs)
-match' o again@(AnyDirectory:xs) path =
- let parts = pathParts (dropWhile isPathSeparator path)
- matches = any (match' o xs) parts || any (match' o again) (tail parts)
- in if null xs && not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) (init parts) && matches
- else matches +match' o (AnyDirectory:xs) path =
- if not (matchDotsImplicitly o)
Flipping the then/else cases and removing this not would be nicer since it avoids a double negation.
In System/FilePath/Glob/Match.hs https://github.com/Deewiant/glob/pull/30#discussion_r242700086:
@@ -122,14 +124,17 @@ match' o (OpenRange lo hi :xs) path = match' o again@(AnyNonPathSeparator:xs) path@(c:cs) = match' o xs path || (not (isPathSeparator c) && match' o again cs)
-match' o again@(AnyDirectory:xs) path =
- let parts = pathParts (dropWhile isPathSeparator path)
- matches = any (match' o xs) parts || any (match' o again) (tail parts)
- in if null xs && not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) (init parts) && matches
- else matches +match' o (AnyDirectory:xs) path =
- if not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
This comment is a bit out of date now, since it talks about "remaining bits" but it's now concerning "the directories matched by **/" or something along those lines.
In System/FilePath/Glob/Match.hs https://github.com/Deewiant/glob/pull/30#discussion_r242701631:
- matches = any (match' o xs) parts || any (match' o again) (tail parts)
- in if null xs && not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) (init parts) && matches
- else matches +match' o (AnyDirectory:xs) path =
- if not (matchDotsImplicitly o)
- -- **/ shouldn't match foo/.bar, so check that remaining bits don't
- -- start with .
- then all (not.isExtSeparator.head) matchedDirs && hasMatch
- else hasMatch
- where parts = pathParts (dropWhile isPathSeparator path)
- matches = map (match' o xs) parts
- hasMatch = or matches
- matchIndex = fromMaybe 0 (elemIndex True matches)
As a tiny performance thing, I think referencing matches twice like this is likely to prevent the intermediate list from getting optimized away. I'm not entirely sure though, feel free to examine the compiler output if you want to check. An alternative would be:
matchIndex = elemIndex True matches hasMatch = isJust matchIndex (matchedDirs, _) = splitAt (fromMaybe 0 matchIndex) parts
At that point you could also go about things more directly and get rid of matches, using findIndex (or whatever it's called) instead of elemIndex on top of map.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Deewiant/glob/pull/30#pullrequestreview-186272739, or mute the thread https://github.com/notifications/unsubscribe-auth/ADcEEpM00yTFqPpfXmdUNLGwf21U95enks5u6VcVgaJpZM4ZXa2E .
I've fixed those issues. I did have a question about how short-circuiting works in Haskell. Would
all (not.isExtSeparator.head) matchedDirs && hasMatch
be more efficient if written as
hasMatch && all (not.isExtSeparator.head) matchedDirs
?
It would be better like that now, yes. Previously the isExtSeparator
check had no dependency on any recursive match results, so I wrote it that way around on the assumption that it's much quicker than the recursive calls involved in hasMatch
(which was called matches
prior to this PR). Currently you need to compute matchIndex
for either side of the &&
, so might as well do the cheaper hasMatch
check first.
(Whether this actually ends up being more efficient for the user depends on how often hasMatch
is true or false...)
fixes #29
There was a check for
null xs
that was preventing it from checking directories past the first one.