Closed pedro-r-marques closed 9 months ago
Hi @pedro-r-marques
thank you for opening an issue!
I do have a pretty good idea where the error is occurring here, but I'd love to have some more information from you beforehand.
file:
, i.e. how did you generate your URIs?Basically file://var/tmp
is interpreted as host=var
and path=/tmp
and we currently seem to ignore the host for local paths. The question is, how we should address this in the best way. Very likely with both: better documentation AND actually using the host and erroring with a clear error msg.
Cheers, Andreas 😃
I initially tried "file://" expecting it to follow other url-like parameters. If the implementation interprets the first string as hostname it should make sure that the value is usable and throw an error otherwise. i.e. if hostname != 'localhost' and hostname != '' then it should imho raise an ArgumentError.
~Notes:~
~We should provide a correct error message for now, and in case pathlib or we decide to support parsing relative path uri's we can switch, see:~
file:path/file.txt
~moved to new issue #142
file://foo/bar
does not represent a relative path on Windows - it represents a UNC path \\foo\bar
.
In my CPython PR I made it work the same on POSIX, and so:
>>> pathlib.Path.from_uri('file://var/tmp')
PosixPath('//var/tmp')
Note that paths with two leading slashes are allowed by POSIX:
A pathname consisting of a single slash shall resolve to the root directory of the process. A null pathname shall not be successfully resolved. A pathname that begins with two successive slashes may be interpreted in an implementation-defined manner, although more than two leading slashes shall be treated as a single slash.
Thanks for the clarification @barneygale!
So does that mean file://localhost/absolute/path/file.txt
would actually not map to /absolute/path/file.txt
on posix systems, but to the path //localhost/absolute/path/file.txt
?
how does that fit together with rfc8089? Are "file" ":" "//" "auth-part"
-style URIs with a host not compatible with posix?
I've never come across file URIs that use file://localhost/
in the wild, but they should probably be handled specially. I'll adjust the PR.
The RFC also says:
The "host" is the fully qualified domain name of the system on which
the file is accessible. This allows a client on another system to
know that it cannot access the file system, or perhaps that it needs
to use some other local mechanism to access the file.
... which works fine on Windows, but not on other systems :(
Fixed in universal_pathlib==0.2.0
. Behavior for "file://path" uris is now identical to fsspec.
>>> import os
>>> os.getcwd()
'/home/user'
>>> import upath
>>> PATHS = ['/tmp/test', 'file:/tmp/test', 'file://tmp/test', 'file:///tmp/test']
>>> {p: upath.UPath(p) for p in PATHS}
{
'/tmp/buh': PosixUPath('/tmp/buh'),
'file:/tmp/buh': FilePath('file:///tmp/buh'),
'file://tmp/buh': FilePath('file:///home/user/tmp/buh'),
'file:///tmp/buh': FilePath('file:///tmp/buh'),
}
I'm seeing this issue in both macos and linux
The variant "file://var/tmp" gives me the contents of "/tmp". The other variants work as expected.