Open EmmaJaneBonestell opened 2 years ago
Thanks for the idea @EmmaJaneBonestell. I'm not too familiar with desirable Windows behaviour, and am paging in @dra27 @jonahbeckford to see if they have any opinions on this. Conforming to RFC 8089 seems sensible to me at any rate.
The Microsoft System.Uri
class claims to adhere to RFC 3986 (and RFC 2396, RFC 2732, and RFC 3987): https://learn.microsoft.com/en-us/dotnet/api/system.uri?view=net-7.0
You can see the result in PowerShell:
C:\temp> New-Object System.Uri("C:\Program Files (x86)\")
AbsolutePath : C:/Program%20Files%20(x86)/
AbsoluteUri : file:///C:/Program%20Files%20(x86)/
LocalPath : C:\Program Files (x86)\
Authority :
HostNameType : Basic
IsDefaultPort : True
IsFile : True
IsLoopback : True
PathAndQuery : C:/Program%20Files%20(x86)/
Segments : {/, C:/, Program%20Files%20(x86)/}
IsUnc : False
Host :
Port : -1
Query :
Fragment :
Scheme : file
OriginalString : C:\Program Files (x86)\
DnsSafeHost :
IdnHost :
IsAbsoluteUri : True
UserEscaped : False
UserInfo :
So yes, including C:
as part of the PathAndQuery
seems the right thing to do.
But I think Microsoft is right in that the path should not have backslashes in it. Is that what Uri.Path is returning today? https://www.ietf.org/rfc/rfc2396.txt Section 3 (Page 10) is clear on (forward) slashes:
URI that are hierarchical in nature use the slash "/" character for separating hierarchical components.
and it labels as "unwise" the use of backslash higher on the same Page 10:
Other characters are excluded because gateways and other transport agents are known to sometimes modify such characters, or they are used as delimiters.
unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
Is it too much to ask for a fixup of forward slashes to get proper Windows URI compliance?
Actually I'm seeing a different problem; the applications I'm interating with do send the URI properly encoded file:///c:/Windows/foo.dll
for example.
However, Uri.path
will return /c:/Windows/foo.dll
which is not a valid Windows path, note that c:/Windows/foo.dll
is.
Thus I'm unsure how this should work, but for now I have to post-process all my Uri.path
results on Windows if they contain a drive letter as to remove the starting slash.
VsCode for example does this massaging here:
While RFC 3986 does not specify how to handle drive letters, the majority of implementations include the drive letter (e.g. C:\ , D:\, ...) on Windows, and so omitting it does not really match up with real-world behavior or applications. Newer RFCs/standards also use this behavior, e.g. RFC 8089 , WhatWG .
Without it, it is impossible to use solely Uri.path to return a path on the non-current drive.
Examples:
Both return:
which only gives a path on the current drive, while, I feel, a more appropriate result would be:
Inconsistently, in these basic examples, using Uri.to_string (instead of Uri.path) will return with the drive letter.