PyFilesystem / pyfilesystem

Python filesystem abstraction layer
http://pyfilesystem.org/
BSD 3-Clause "New" or "Revised" License
288 stars 63 forks source link

OSFS hates backslashes. #139

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. mkdir /tmp/D\:\\Foobar
2. fs = osfs.OSFS('/tmp')
3. fs.listdir('/')

fs.errors.ResourceNotFoundError: Resource not found: /D:/Foobar

What is the expected output? What do you see instead?

I should see a directory listed with the name "D:\Foobar"

An exception is raised because D:/Foobar is an invalid path (which it is). The 
underlying problem is that in getsyspath():

path = relpath(normpath(path)).replace("/",os.sep)

normpath() replaces "\" with "/". I am not sure why that is necessary, but it 
causes unexpected results.

Is this replacement in place to "save me from myself" or is there some other 
reason for it? I think that if the caller passes a path, it should be used 
verbatim. If the path is invalid as provided an errors should be returned, but 
making a valid path invalid is not desirable.

Original issue reported on code.google.com by btimby@gmail.com on 19 Nov 2012 at 9:33

GoogleCodeExporter commented 9 years ago
So this is on Linux? To be honest with you, I didn't know that a backslash was 
a valid character in a path.

The os.sep adjustment should only be done when constructing the OSFS. I suspect 
this overzealous slash replacement was to avoid having different behaviour on 
Windows, but it is clearly flawed.

Would modifying normpath to not do the replace be enough to fix this issue 
(without breaking tests)?

Original comment by willmcgugan on 20 Nov 2012 at 12:55

GoogleCodeExporter commented 9 years ago
Yes, on Linux. Linux does not make many assertions about what characters can be 
used in a path. I believe / and \0 are the only restrictions. I am not 
advocating the use of \ in Linux paths, but the software I maintain runs on 
Linux and is used by Windows users, and they do some funny things ;-).

Simply removing the replace() call in normpath() causes everything to work. I 
have not yet executed the tests, I will.

Original comment by btimby@gmail.com on 20 Nov 2012 at 2:46

Attachments:

GoogleCodeExporter commented 9 years ago
A couple of unit tests failed, they specifically checked whether '\' was 
replaced with '/'. I modified the tests to check that '\' is preserved.

Original comment by btimby@gmail.com on 20 Nov 2012 at 4:42

Attachments:

GoogleCodeExporter commented 9 years ago
Just been doing some testing on this myself... while your patch works on Linux, 
it makes all the TestMountFS and TestMountFS_stacked unit tests break on Win32 
:-(

Done a bit of debugging, and it seems the fix is to replace the 'os.path.join' 
used by TestMountFS.check() and TestMountFS_stacked.check() (in 
fs/tests/test_fs.py) to 'pathjoin' instead (since the pyfs commands now don't 
treat os.sep on Win32 as a path-separator i.e. os.path.exists works with 
os.path.join, but fs.exists only works with fs.path.join). And there's still 
failing tests with fs.tests.test_watch.TestWatchers_TempFS which I haven't 
bothered to track down yet.

I guess you'll also want to modify your patch to fix the 
_requires_normalization regex and the normpath docstring (in fs/path.py) ?

Hmmm... although now that I think about it some more, maybe we /should/ keep 
the original behaviour? As the way I understand it, pyfilesystem is supposed to 
be a "lowest common denominator" type thing and work the same across all 
platforms, and with this proposed change fs.makedir("some\\strange\\filename", 
recursive=True) will now do something *very* different on Linux than it would 
do on Windows...  :-|

Original comment by gc...@loowis.durge.org on 21 Nov 2012 at 1:32

GoogleCodeExporter commented 9 years ago
I did not bother changing the regex, because the replace was done BEFORE 
checking the regex, so I assumed the two were not related. Attaching a patch 
with a modified regex.

The original behavior is acceptable if there is a way to use valid paths with 
the library. Some method of escaping the '\' so that it is treated literally 
rather than as a pathsep (it seems the original intent was to normalize the 
pathsep).

The problem I see with escaping is that the library itself will require heavy 
modification, since "input" in this case includes function parameters as well 
as return values from the os (listdir() in particular). For my case, I am not 
even passing the path, the failure is completely within the fs.listdir() 
function (since it calls os.listdir(), then os.stat() on the path in question).

Original comment by btimby@gmail.com on 21 Nov 2012 at 2:39

Attachments:

GoogleCodeExporter commented 9 years ago
Wrong patch in last comment.

Original comment by btimby@gmail.com on 21 Nov 2012 at 5:05

Attachments:

GoogleCodeExporter commented 9 years ago
I'm going to raise this on the mailing list. It's such a fundamental thing, we 
should probably hammer it out. I'm not sure of the right way to go myself.

Original comment by willmcgugan on 21 Nov 2012 at 9:42

GoogleCodeExporter commented 9 years ago

Original comment by willmcgugan on 10 Sep 2013 at 8:52