Open earshinov opened 6 months ago
Thanks for the detailed research. It's taken me a while to get back to looking at django-watchfiles.
Ideally Python should offer some equivalent to directory.glob(pattern) -> filenames, which would allow one to validate a filename against a glob within a given directory or, to the same extent, to validate a relative path against a glob, something like globmatch(relative_path, glob) -> bool.
Yes, that would be great.
Maybe we can reuse some internals of the glob module, which is what Path.glob() relies on.
I would like to check that before adding a dependency. globber does not look well-maintained, with the last release in 2019.
Perhaps we can copy in the relevant source instead, or find a hybrid where we rely on some internals of glob.
Python Version
3.10.6
Django Version
4.1.3
Package Version
0.1.1
Description
Intro
Let me start with an example. Here is an example Django app configuration:
my_app/apps.py:
How are these patterns handled by Django's default StatReloader: https://github.com/django/django/blob/f030236a86a64a4befd3cc8093e2bbeceef52a31/django/utils/autoreload.py#L411
TLDR: directory.glob is used, so a pattern like **/.env would, for example, match a file named .env placed directly in the given directory.
Here is what django-watchfiles is doing: https://github.com/adamchainz/django-watchfiles/blob/fb2cbc2d08a45301d293302e4228e377aec2f6b1/src/django_watchfiles/__init__.py#L64
TLDR: It attempts to check if the relative path from directory to a changed file satisfies a glob pattern with fnmatch.
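The difference between the two approaches can be reproduced with a small sketch. It is not the code of either reloader: the temporary directory, the file names, and the candidate list are made up for illustration, and only `Path.glob` and `fnmatch.fnmatch` from the standard library are used.

```python
import fnmatch
import tempfile
from pathlib import Path

pattern = "**/.env"

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Illustrative layout: one .env at the top level, one in a subfolder.
    (directory / ".env").touch()
    (directory / "sub").mkdir()
    (directory / "sub" / ".env").touch()

    # Glob expansion on disk, the approach attributed to StatReloader above.
    glob_hits = sorted(
        p.relative_to(directory).as_posix() for p in directory.glob(pattern)
    )

    # String matching of relative paths, the approach attributed to
    # django-watchfiles above.
    candidates = [".env", "sub/.env"]
    fnmatch_hits = [c for c in candidates if fnmatch.fnmatch(c, pattern)]

print(glob_hits)     # ['.env', 'sub/.env']
print(fnmatch_hits)  # ['sub/.env']
```

With fnmatch, the top-level .env is missed because the pattern requires a literal / somewhere in the path.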
The problem
Using fnmatch like this is no replacement for directory.glob(...). You can get more intuition, for example, here: https://stackoverflow.com/questions/27726545/. For the sake of illustration:
fnmatch does not handle ** at all: https://docs.python.org/3/library/fnmatch.html
fnmatch does not treat path separators specially, so * also matches /. So, while Path('/home/user/').glob('*.txt') would only return .txt files placed directly in the given directory, fnmatch('a.txt', '*.txt') and fnmatch('some/nested/folder/a.txt', '*.txt') would both return true.
So, the user gets different results when using the same pattern depending on whether StatReloader or WatchfilesReloader is used.
Possible solutions
Ideally Python should offer some equivalent to directory.glob(pattern) -> filenames, which would allow one to validate a filename against a glob within a given directory or, to the same extent, to validate a relative path against a glob, something like globmatch(relative_path, glob) -> bool.
Unfortunately, there is no such thing in Python's standard library. Also, as far as I discovered, there are no third-party packages that specifically aim to offer such an equivalent to Path.glob. However, there are packages that offer similar glob-matching.
I have prepared a test project comparing Path.glob with fnmatch, globmatch and globber:
fnmatch and globmatch are junk, but globber did well, matching the behavior of Path.glob in my test cases exactly. Here is a Google sheet with the results (those matching Path.glob highlighted with green): https://docs.google.com/spreadsheets/d/1M2fcYlW19n1HACQi_MbYgBIMtbopnGur6mVjyNaowqc/
Based on that, I would suggest replacing fnmatch with globber. globber is not a mature project and is not widely known, but it certainly improves the situation. Given that people haven't discovered that additional globs, basically, just don't work as expected with django-watchfiles, they probably won't have issues with globber either. If / when they do, globber's source code is only some 30 lines, so it should be fairly easy to tweak it if needed: https://github.com/asharov/globber/blob/main/globber/globber.py
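For reference, the globmatch(relative_path, glob) -> bool idea could be sketched roughly as below. This is a hypothetical helper written for illustration; it is not globber's code, nor anything from the standard library or django-watchfiles. It assumes '/'-separated relative paths, supports only *, ? and a whole-segment **, and ignores character classes such as [abc]. It has not been checked against Path.glob beyond the handful of cases shown.

```python
import re


def _segment_to_regex(segment: str) -> str:
    """Translate one path segment; wildcards never cross a '/'."""
    pieces = []
    for char in segment:
        if char == "*":
            pieces.append("[^/]*")
        elif char == "?":
            pieces.append("[^/]")
        else:
            pieces.append(re.escape(char))
    return "".join(pieces)


def globmatch_sketch(relative_path: str, pattern: str) -> bool:
    """Hypothetical helper: does a '/'-separated relative path match the glob?"""
    segments = pattern.split("/")
    regex = ""
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        if segment == "**":
            # '**' stands for zero or more whole directories.
            # Assumption: a trailing '**' matches anything below it, which
            # may differ from what Path.glob returns.
            regex += ".*" if is_last else "(?:[^/]+/)*"
        else:
            regex += _segment_to_regex(segment) + ("" if is_last else "/")
    return re.fullmatch(regex, relative_path) is not None


print(globmatch_sketch(".env", "**/.env"))                      # True
print(globmatch_sketch("sub/.env", "**/.env"))                  # True
print(globmatch_sketch("a.txt", "*.txt"))                       # True
print(globmatch_sketch("some/nested/folder/a.txt", "*.txt"))    # False
print(globmatch_sketch("some/nested/folder/a.txt", "**/*.txt")) # True
```

A caller on Windows would need to convert the relative path with Path.as_posix() first, since the sketch only understands forward slashes.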