Open AndydeCleyre opened 5 years ago
This occurs on Windows too. Given a folder D:\TXT_FILES[test] containing the files a.txt and b.txt, running
from plumbum import local
folder = local.path(r"D:\TXT_FILES[test]")
folder // "*.txt"
# Results in: []
Doing the same without the [test] part in the name works.
I just experienced this. Essentially, the problem is that the documentation implies that the // operator globs only the wildcard on the right-hand side, but it actually appears to glob the whole joined path. That breaks the well-formed filenames used to build a local.path: a folder for which is_dir() returns True, and which holds files that match the glob, can appear to contain nothing (because once the square brackets are treated as glob syntax they no longer match themselves).
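The bracket behavior can be reproduced with the standard library alone. This is a minimal sketch that uses fnmatch as a stand-in for the per-component matching a glob performs; plumbum is not involved, and the names are only illustrative.

```python
import fnmatch

# In glob syntax, "[test]" is a character class matching ONE of t, e, s.
pattern = "this[test]"

# The literal directory name does not match its own name used as a pattern...
literal_matches = fnmatch.fnmatchcase("this[test]", pattern)

# ...but a neighboring name ending in one of those characters does.
neighbor_matches = fnmatch.fnmatchcase("thise", pattern)

print(literal_matches, neighbor_matches)  # False True
```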
Here's a test run. What I'm doing here is setting up a case where the directory will actually switch to a neighboring folder when globbing is applied, giving a completely preposterous result.
$ mkdir 'this[test]'
$ touch this\[test\]/a
$ mkdir thise
$ touch thise/b
$ fg
[3] 785992 continued python3.8
>>> p = local.path('this[test]/')
>>> p.is_dir()
True
>>> (p/'a').is_file()
True
>>> [s.is_file() for s in p//('a', 'b')]
[True]
>>> {str(s):s.is_file() for s in p//('a', 'b')}
{'/mnt/hdd-8TB/sharing/downloads/thise/b': True}
>>>
I tried to make a workaround using a 'with cwd' context manager, but it exploded worse than expected. This might be my own mistake, or a different bug.
>>> with local.cwd(p) as path:
...     print('.' // ('a', 'b'))
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/william/.local/lib/python3.8/site-packages/plumbum/path/local.py", line 385, in __call__
    newdir = self.chdir(newdir)
  File "/home/william/.local/lib/python3.8/site-packages/plumbum/path/local.py", line 370, in chdir
    os.chdir(str(newdir))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/hdd-8TB/sharing/downloads/this\\[test\\]'
>>>
OK, I found a kind of workaround: you have to use a different library to do the globbing, or just use a regexp. For example, an expression like p // '*.mp3' can be replaced by [f for f in p if f.suffix == '.mp3']. It's a bit wordy, but not as bad as things might be. Obviously we could also use p.walk for more complex globs.
But this whole thing makes // a lot less generally useful.
So the bug is that a // b causes the glob to apply to a? That might be fixable.
Exactly, a // b applies the glob to the concatenation of a/b instead of scanning for glob matches to b strictly under the path a. And you can see the implementation problem, although in different forms, in the _glob functions in both remote and local paths -- both of them apply the globbing after concatenating the glob to the full pathname, without any escapes to protect the path.
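To illustrate the difference between globbing the concatenation and globbing strictly under the base, here is a rough sketch using only the standard library's glob module. It is not plumbum's _glob code; glob_under is a hypothetical helper, and whether an actual fix should take this shape is an assumption. It recreates the this[test]/a and thise/b layout from the test run in a temporary directory.

```python
import glob
import os
import tempfile


def glob_under(base, pattern):
    """Hypothetical helper: match `pattern` strictly under the directory `base`."""
    # glob.escape() neutralizes *, ? and [ in the base so only `pattern` is globbed.
    return sorted(glob.glob(os.path.join(glob.escape(base), pattern)))


with tempfile.TemporaryDirectory() as root:
    # Same layout as the test run: this[test]/a and thise/b.
    tricky = os.path.join(root, "this[test]")
    neighbor = os.path.join(root, "thise")
    os.mkdir(tricky)
    os.mkdir(neighbor)
    open(os.path.join(tricky, "a"), "w").close()
    open(os.path.join(neighbor, "b"), "w").close()

    # Globbing the plain concatenation treats "[test]" as a character class.
    naive = sorted(glob.glob(os.path.join(tricky, "*")))
    # Escaping the base first keeps the search inside this[test]/.
    escaped = glob_under(tricky, "*")

    naive_names = [os.path.relpath(p, root) for p in naive]
    escaped_names = [os.path.relpath(p, root) for p in escaped]

print(naive_names)    # only the file from the neighboring folder, thise/b
print(escaped_names)  # only the file actually under this[test]/
```

The remote path implementation would presumably need an equivalent escape for whatever the remote shell expands, which this sketch does not cover.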
Also, I found the documentation for this, and the behavior in the example I gave just above does indeed contradict it.
https://plumbum.readthedocs.io/en/latest/api/path.html#api-path "Returns a (possibly empty) list of paths that matched the glob-pattern under this path."
In my example, I construct a pair of paths designed to subvert that, making an unsuspecting use of globbing look outside of the specified path. This doesn't seem possible to defend against, aside from programmer paranoia or just never using globbing.
The .walk function and the __iter__ constructs are safe from this. A temporary fix would be to reimplement glob() or floordiv in terms of those. Odds are we can do better, but I don't know the real constraints.
-Wm
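The "list first, then filter" idea can be sketched as follows. This is only an illustration with the standard library (os.listdir plus fnmatch), not a proposed patch to plumbum; glob_by_listing is a hypothetical name, and it handles a single-level pattern only.

```python
import fnmatch
import os
import tempfile


def glob_by_listing(base, pattern):
    """Hypothetical helper: list `base` literally, then filter names by `pattern`."""
    # os.listdir() never interprets `base` as a pattern, so brackets are safe.
    names = fnmatch.filter(os.listdir(base), pattern)
    return sorted(os.path.join(base, name) for name in names)


with tempfile.TemporaryDirectory() as root:
    tricky = os.path.join(root, "TXT_FILES[test]")
    os.mkdir(tricky)
    for name in ("a.txt", "b.txt", "c.mp3"):
        open(os.path.join(tricky, name), "w").close()

    found = [os.path.basename(p) for p in glob_by_listing(tricky, "*.txt")]

print(found)  # ['a.txt', 'b.txt']
```

One difference worth keeping in mind: unlike glob, fnmatch does not treat names starting with a dot specially, so a pattern like * would also return hidden files here.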
On Sat, Jul 10, 2021 at 8:53 AM Henry Schreiner @.***> wrote:
So the bug is that a // b causes the glob to apply to a? That might be fixable.
I just started looking for globbing in the existing tests, and it's not directly related to this issue, but it looks like this assertion will always succeed if the previous one does.
Arch Linux, Python 3.7.4, Plumbum 1.6.7
While only that last one raises an exception, none of those globs match properly.