mcbuddha closed 7 years ago
A reserved character doesn't necessarily mean we encode it. "/" is also a reserved character, but we don't encode that in a URL. Sub-delimiters are only significant in certain segments of the URL. We should actually include all sub-delimiters as exceptions. Possibly also add an option to customize the set of untouched characters, though I'm less certain about whether that's a good idea.
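To make the idea of a customizable set of untouched characters concrete, here is a minimal Python sketch. It is not Ring-Codec's implementation: `percent_encode` and its `untouched` parameter are hypothetical names, and the two character sets are copied from RFC 3986 sections 2.2 and 2.3.

```python
# Character sets from RFC 3986: section 2.3 (unreserved), section 2.2 (sub-delims).
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
SUB_DELIMS = frozenset("!$&'()*+,;=")


def percent_encode(text, untouched=UNRESERVED | SUB_DELIMS):
    """Hypothetical helper: percent-encode every UTF-8 byte not in `untouched`."""
    out = []
    for byte in text.encode("utf-8"):
        # Only ASCII bytes can ever be left as-is; everything else is escaped.
        if byte < 0x80 and chr(byte) in untouched:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


print(percent_encode("foo+bar"))                        # foo+bar
print(percent_encode("foo+bar", untouched=UNRESERVED))  # foo%2Bbar
```

With the default set, sub-delimiters such as "+" pass through; a caller who knows the "+" is data rather than a delimiter can narrow the set and get it escaped.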
I understand that just because `+` is reserved, it doesn't mean that it should be encoded. However, it is also absent from the "unreserved" list, so making it explicitly unreserved (and thus never encoded) conflicts with the RFC.
I guess clients could skip calling `url-encode` for the parts they don't want to encode (because they know that those are sub-delimiters). Without the PR, though, it's pretty difficult to encode a `+` sign when you know that it is not a sub-delimiter but part of a query param or something else.
If it's a query parameter, then `form-encode` should be used. Ring-Codec distinguishes between URL-encoding, and encoding data in the "x-www-form-urlencoded" format. Often the two formats are conflated, but they have different semantics.
In general, if you're encoding a query string or form, then `form-encode` should be used. If you're encoding something that will sit in the path of the URI, then `url-encode` should be used.
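The same split between path encoding and form encoding exists in other ecosystems. As a rough illustration only, here it is with Python's standard `urllib.parse`, which is not Ring-Codec and may differ from it in which characters it leaves alone; the sample values are made up for the example.

```python
from urllib.parse import quote, urlencode

# Path segment: percent-encode, so a space becomes %20.
segment = "rock & roll"
path = "/tags/" + quote(segment, safe="")
print(path)   # /tags/rock%20%26%20roll

# Query string: form-encode, so a space becomes "+" and a literal "+" becomes %2B.
query = urlencode({"q": "rock & roll", "op": "1+1"})
print(query)  # q=rock+%26+roll&op=1%2B1
```

The point is that a "+" only means "space" inside form-encoded data, which is why the two encoders cannot share one rule for it.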
Got it! Thanks
As an aside, the current functionality stems from a problem someone had with handling a URL like http://example.com/tags/foo+bar. When the last path segment was decoded with Java's URL decoder, it produced "foo bar". That is correct when decoding a query string, but not necessarily for a path segment.
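The decoding mismatch described above can be reproduced in a few lines. This sketch uses Python's `urllib.parse` as a stand-in for the Java decoder mentioned in the thread: `unquote_plus` applies form-decoding rules, while `unquote` applies plain percent-decoding.

```python
from urllib.parse import unquote, unquote_plus, urlsplit

url = "http://example.com/tags/foo+bar"
last_segment = urlsplit(url).path.rsplit("/", 1)[-1]

# Plain percent-decoding leaves "+" alone, which suits a path segment.
print(unquote(last_segment))       # foo+bar

# Form-decoding treats "+" as a space, which is only right for form data.
print(unquote_plus(last_segment))  # foo bar
```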
https://tools.ietf.org/html/rfc3986#section-2.2