Open chuanran opened 3 years ago
watchmedo
uses pathlib.PurePath.match() for matching paths with patterns. This means it is possible, for example, to ignore everything under a directory /path/to/src/tests/
as follows:
watchmedo auto-restart -d /path/to/src --ignore-patterns='/path/to/src/tests/*;/path/to/src/tests/**/*' -- ...
The need to use both path/**/*
and path/*
is somewhat annoying, and I couldn't find a way around it, but the important thing is that this is possible!
As a path to improving watchmedo
, it seems that using functions from the fnmatch
module instead of pathlib.Path.match()
would make this simpler, as one could simply use path/*
. That would be a backwards-incompatible change, though, and in my opinion it wouldn't be worth it just for this. (But using pathlib.Path.match()
as currently done has some other disadvantages as well, such as the pattern "b/c.py"
matching a file "a/b/c.py"
.)
Given that the current behavior is not ideal, and not intuitive, we could think about moving to fnmatch
. I would be OK with such enhancement.
For future reference, if someone does implement something like this, it seems it would be better to use gitignore-style path specifications (e.g. via the pathspec library), since fnmatch is aimed at matching file names rather than paths, and has some drawbacks for this use case due to that.
Also, to allow providing multiple path specs via cmdline, it would probably be better to allow specifying --pattern(s)
and/or --ignore-pattern(s)
multiple times rather than using a separator character as is currently done.
@taleinat
It seems that watchmedo auto-restart
currently doesn't support recursive patterns. I believe that for pathlib.PurePath.match(), path/**/*
is equivalent to path/*/*
, which only takes into account 2 levels of nesting. This is also what happens in my testing: changes deeper than 2 layers are not taken into account by either --patterns or --ignore-patterns. You can confirm this behavior in the python shell:
$ python3
>>> from pathlib import PurePath
>>> PurePath("a/b/c/d.txt").match("b/**/*")
True
>>> PurePath("a/b/c/d/e.txt").match("b/**/*")
False
This means that there is no reliable way to, for example, watch for recursive changes in directory my_project
while recursively ignoring anything that happens in my_project/build
. My current workaround is to construct ignore patterns programmatically with many layers of nesting i.e. build;build/*;build/*/*;build/*/*/*;[...]
, which, as you can imagine, not ideal.
I believe this is a real functionality limitation and I'm wondering if it's worth introducing additional parameters, such as --glob-pattern(s)
and --ignore-glob-pattern(s)
, that would use more of a glob-style pattern instead?
Appreciate your help on this!
You appear to be right, @guitarino! It seems that pathlib.PurePath.match() doesn't handle **
the way I thought it did.
@BoboTiG, how would you suggest approaching this in terms of maintaining backwards-compatibility? Add a new CLI flag for this?
Am I right if I say that pathspec
is the way to go, and will handle all cases out of the box?
About backward compatibility, I would say there is no really issue: we are trying to fix the initial behaviour. I would vote for fixing the current behaviour without introducing new arguments.
Am I right if I say that
pathspec
is the way to go, and will handle all cases out of the box?
I don't know, but without testing various cases, I wouldn't assume that.
About backward compatibility, I would say there is no really issue: we are trying to fix the initial behaviour. I would vote for fixing the current behaviour without introducing new arguments.
I do think that if any currently valid use of --patterns
or --ignore-patterns
, changes in behaviour due to this change, that would be a backwards-compatibility break. This includes edge cases!
I would exclude uses of .../**/...
from the above statement, since those quite obviously don't behave as anyone would expect, but I'd be careful before changing pretty much anything else. And there are so many edge cases with paths...
Unless we can retain full backwards-compat, it might be better to cut a new major version for this change, and clearly document it in the major changes and an upgrade guide.
Unless we can retain full backwards-compat, it might be better to cut a new major version for this change, and clearly document it in the major changes and an upgrade guide.
Yes, indeed. It would be part of a major release. What I would like to prevent is adding new CLI arguments.
Looks like it's been about 9 months since this issue has gotten a comment. Obviously, this would be a breaking change and require a more significant release. Any thoughts to share about a release that would include the arbitrary-depth-globbing feature?
The release is not an issue at all. We need someone to steps in, and provides a pull-request first.
From the usage of
watchmedo auto-restart
, we can see the only possible way we can use to exclude a specific folder is using--ignore-pattern
parameter, as following shows.Checking the source code of
watchmedo
, we can find thatwatchmedo
usesPurePosixPath
to match the path and patterns (including ignore-patterns), and it is NOT supportingregex
at all, so we cannot useregex
to match the pattern.in the doc for
match
method ofPurePosixPath
, can find that it doesNOT
support recursively searching and matching files in a specific folder and its sub-folders.Is it possible for watchmedo to support such feature that can ignore any changes for a specific folder and its subfolders recursively? Thanks