Open mitchellwrosen opened 9 months ago
This is, essentially, a documentation bug.
The isUnescapedInURI function is answering the question, "Can this appear at all in a URI?" The docstring for it gives an example URI containing non-ASCII characters (with umlauts and such), and those absolutely have to be escaped before they can be included.
Reserved characters like ? are allowed to appear in a finished URI, so both isReserved and isUnescapedInURI return True for them. But the docstring for isReserved is trying to put you on the right path: if you're forming a URI out of parts, and one of those parts contains a reserved character, you'd better escape it.
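To make the distinction concrete, here is a small sketch that can be run against the network-uri package. The sample characters and the string "b?är" are my own choices for illustration, not taken from the docstrings.

```haskell
import Network.URI (escapeURIString, isReserved, isUnescapedInURI)

main :: IO ()
main = do
  -- '?' is reserved, yet it may appear literally in a finished URI.
  print (isReserved '?', isUnescapedInURI '?')        -- (True,True)
  -- A non-ASCII character is not reserved, but it must be escaped.
  print (isReserved 'ä', isUnescapedInURI 'ä')        -- (False,False)
  -- Escaping with isUnescapedInURI leaves '?' alone and
  -- percent-encodes the UTF-8 bytes of 'ä'.
  putStrLn (escapeURIString isUnescapedInURI "b?är")  -- b?%C3%A4r
```

So isUnescapedInURI only tells you whether a character is legal somewhere in a URI, not whether it is safe in the part you are building.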
In fact, the companion function for isUnescapedInURI, namely isUnescapedInURIComponent, is going to be the more useful one: if you are forming a URI out of parts and including arbitrary strings, you should use that one to escape the parts. I'm not sure what you would use isUnescapedInURI for.
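As a rough illustration of the difference, assuming the network-uri package, here is what the two predicates do to a made-up query parameter value ("a&b=c" and the "q=" key are hypothetical):

```haskell
import Network.URI (escapeURIString, isUnescapedInURI, isUnescapedInURIComponent)

main :: IO ()
main = do
  let value = "a&b=c"  -- hypothetical query parameter value
  -- The component predicate escapes the reserved '&' and '='.
  putStrLn ("q=" ++ escapeURIString isUnescapedInURIComponent value)  -- q=a%26b%3Dc
  -- The whole-URI predicate leaves them alone, so the value would be
  -- read as two parameters.
  putStrLn ("q=" ++ escapeURIString isUnescapedInURI value)           -- q=a&b=c
```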
I'll have a go at improving the docstrings for the isUnescaped functions.
Well, after playing with it for a bit more, I realized isUnescapedInURIComponent is rarely what you want, either. It will encode, say, a slash character, which is a problem when forming a path:
>>> URI {
>>> uriScheme = "http:",
>>> uriAuthority = Nothing,
>>> uriPath = escapeURIString isUnescapedInURIComponent "/foo/b?ar/baz", -- you want the question mark escaped
>>> uriQuery = "",
>>> uriFragment = ""
>>> }
http:%2Ffoo%2Fb%3Far%2Fbaz
The result escapes the question mark as desired, but also the slashes. Unescaped slashes would not break the parsing at that point, and you'd usually want to keep them as path separators.
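One possible workaround, sketched here with a hypothetical helper (escapePathSegments is not part of network-uri), is to escape each path segment separately and add the slashes afterwards:

```haskell
import Network.URI (escapeURIString, isUnescapedInURIComponent)

-- Hypothetical helper, not part of network-uri: escape every segment
-- on its own, then prefix each one with a literal '/'.
escapePathSegments :: [String] -> String
escapePathSegments =
  concatMap (\seg -> '/' : escapeURIString isUnescapedInURIComponent seg)

main :: IO ()
main =
  -- The '?' inside a segment is escaped; the separators are not.
  putStrLn (escapePathSegments ["foo", "b?ar", "baz"])  -- /foo/b%3Far/baz
```

This assumes you already have the path as a list of segments. A slash inside a segment would still be encoded as %2F, which is probably what you want at that level.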
I will still try to improve the documentation, although I'll be doing some gymnastics to try to make either of these functions sound useful...
The isUnescapedInURI documentation says:
However, its implementation is:
where the isReserved documentation says:
So, it seems to me that if isReserved returns True, then isUnescapedInURI ought to return False.