PyFilesystem / pyfilesystem2

Python's Filesystem abstraction layer
https://www.pyfilesystem.org
MIT License

OSFS failed to consume Windows file uri: #341

Open chfw opened 5 years ago

chfw commented 5 years ago

Hi all,

According to the blog post (https://blogs.msdn.microsoft.com/ie/2006/12/06/file-uris-in-windows/) shared by @lurch, the proper Windows file URI for D:\Program Files\Viewer\startup.htm is:

Incorrect: file://D:\Program Files\Viewer\startup.htm
Correct: file:///D:/Program%20Files/Viewer/startup.htm
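
For reference, the correct form can be reproduced with the Python standard library. This is only a minimal sketch to illustrate the expected URI, not something PyFilesystem does today; it uses pathlib.PureWindowsPath, which applies Windows path rules on any platform:

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths even on Linux or macOS
win_path = PureWindowsPath(r"D:\Program Files\Viewer\startup.htm")

# as_uri() emits three slashes, forward slashes, and percent-encoded spaces
uri = win_path.as_uri()
print(uri)  # file:///D:/Program%20Files/Viewer/startup.htm
```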

Please note the use of file:/// rather than file://. Here is how the problem manifests itself on AppVeyor:

======================================================================
ERROR: test_consume_geturl (tests.test_tempfs.TestTempFS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\projects\pyfilesystem2\tests\test_osfs.py", line 172, in test_consume_geturl
    open_fs(base_dir)
  File "C:\Python36-x64\lib\site-packages\fs-2.4.11a0-py3.6.egg\fs\opener\registry.py", line 228, in open_fs
    default_protocol=default_protocol,
  File "C:\Python36-x64\lib\site-packages\fs-2.4.11a0-py3.6.egg\fs\opener\registry.py", line 189, in open
    open_fs = opener.open_fs(fs_url, parse_result, writeable, create, cwd)
  File "C:\Python36-x64\lib\site-packages\fs-2.4.11a0-py3.6.egg\fs\opener\osfs.py", line 41, in open_fs
    osfs = OSFS(path, create=create)
  File "c:\projects\pyfilesystem2\fs\osfs.py", line 139, in __init__
    raise errors.CreateFailed("'{}' does not exist".format(_root_path))
fs.errors.CreateFailed: 'C:\C:\Users\appveyor\AppData\Local\Temp\1\tmp67y_ixmw__tempfs__' does not exist

https://ci.appveyor.com/project/willmcgugan/pyfilesystem2/builds/26552739/job/1478h3r51d5renwg

Here is how to reproduce it with fs:

>>> import fs
>>> path = 'file:///C:/Users/appveyor/AppData/Local/Temp/1/tmpn_r3w0_tfstestosfs/test.file'
>>> fs.open_fs(path)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jaska/github/moremoban/pyfilesystem2/fs/opener/registry.py", line 228, in open_fs
    default_protocol=default_protocol,
  File "/Users/jaska/github/moremoban/pyfilesystem2/fs/opener/registry.py", line 189, in open
    open_fs = opener.open_fs(fs_url, parse_result, writeable, create, cwd)
  File "/Users/jaska/github/moremoban/pyfilesystem2/fs/opener/osfs.py", line 41, in open_fs
    osfs = OSFS(path, create=create)
  File "/Users/jaska/github/moremoban/pyfilesystem2/fs/osfs.py", line 139, in __init__
    raise errors.CreateFailed("'{}' does not exist".format(_root_path))
fs.errors.CreateFailed: '/C:/Users/appveyor/AppData/Local/Temp/1/tmpn_r3w0_tfstestosfs/test.file' does not exist
>>> from fs.opener.parse import parse_fs_url
>>> parse_fs_url(path)
ParseResult(protocol='file', username=None, password=None, resource='/C:/Users/appveyor/AppData/Local/Temp/1/tmpn_r3w0_tfstestosfs/test.file', params={}, path=None)
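
The parsed resource keeps the leading slash in front of the drive letter ('/C:/...'), which is what ends up being treated as a path relative to the current drive. One possible direction, sketched here purely as an assumption and not as the fix adopted by the project, is to strip that slash and decode percent-escapes before handing the path to OSFS. The helper name file_uri_to_windows_path is hypothetical and uses only the standard library:

```python
import re
from urllib.parse import unquote, urlparse

# Matches a leading slash followed by a drive letter, e.g. "/C:/" or "/C:"
_DRIVE_RE = re.compile(r"^/([A-Za-z]:)(/|$)")


def file_uri_to_windows_path(uri):
    """Hypothetical helper: turn a file:/// URI into a Windows-style path."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError("not a file URI: {!r}".format(uri))
    # Decode percent-escapes such as %20 back into literal characters
    path = unquote(parsed.path)
    # "/C:/Users/x" -> "C:/Users/x"; paths without a drive letter are left alone
    path = _DRIVE_RE.sub(r"\1\2", path)
    return path.replace("/", "\\")


print(file_uri_to_windows_path("file:///D:/Program%20Files/Viewer/startup.htm"))
```

The sketch ignores the host part of the URI (UNC shares), so it only covers the local drive-letter case discussed in this issue.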

I am still thinking of a fix.

lurch commented 5 years ago

As @willmcgugan hinted at in the other issue, there are likely to be subtle differences between "external URLs" (as returned by geturl or geturl(purpose='download')) and FS URLs (as returned by geturl(purpose='fs') and 'consumed' by open_fs).

There may be places in which an external-URL and an FS-URL are identical, but there may also be corner-cases in which they're different. @willmcgugan perhaps the documentation at https://pyfilesystem2.readthedocs.io/en/latest/openers.html could / should be made clearer in this regard? Or should open_fs (and OSFS.geturl) be adapted to better work with external-URLs? (with external-URLs still being a subset of FS-URLs)

willmcgugan commented 5 years ago

The docs could definitely use more detail here. I will update that once @chfw has merged his changes.

Generally though, FS URLs generated with purpose="fs" don't need to be compatible with anything outside PyFilesystem. So they don't really need to follow the spec for file:// URLs, as long as they work with open_fs.

In the default case where purpose="download" the URLs should follow the appropriate spec. I can see from the current implementation in osfs.py that the geturl implementation is naive.

@chfw keep us posted!