Open dgryski opened 1 year ago
Change https://go.dev/cl/450375 mentions this issue: net/url: accept invalid percent encodings
Thanks Ian.
Doesn't loosening of URI validation risk propagating invalid %-encodings to other systems that will choke on the invalid URIs (like, for example, something built with go1.19)? This seems opposite to the logic used to justify tightening of ParseIP logic in go1.17 (https://github.com/golang/go/issues/30999).
I would not have expected the stdlib method to relax validation unconditionally like this
aside from the question about whether allowing these invalid sequences is going to cause problems with other systems, stdlib methods are widely used to validate things like IPs and URIs which are persisted and have to interoperate with multiple go versions. If this is going to update go1.20 to accept URIs as valid which go1.19 rejected as invalid, can this also include guidance for consumers to roll out this change?
if my "git archeology" is correct, the whatwg spec has changed
onJan 14, 2014
,
https://github.com/whatwg/url/commit/3cfaa1779bfb9a3ba2b907a5802ca0251ca9a7e6
that predates https://github.com/golang/go/issues/11249 , and makes arguable to change the behavior on the parsers at this point, of course, I'm biased due to the implications of this change in kubernetes/kubernetes :)
I guess there are two issues conflated in the earlier bug. The first is "Please support decoding %u0000
as Unicode" (which I am not in favour of) and the second is "%u
shouldn't appear in URLs" (which the current spec allows and provides handling logic for).
Given that 1) there are sites in the wild with %u
in their URLs and 2) other languages support the spec and thus those URLs, I believe that net/url
should too.
Do we want to open another issue as this is closed?
Do we want to open another issue as this is closed?
Yeah, I'll open one on Monday
Change https://go.dev/cl/452795 mentions this issue: Revert "net/url, net/http/httputil: accept invalid percent encodings"
CL is being reverted.
Moving this to Go1.23 as it is quite late in the cycle.
(This is a reopened version of https://github.com/golang/go/issues/11249 because the whatwg spec has changed.)
What version of Go are you using (
go version
)?Go 1.19
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?All.
What did you do?
Tried to parse a URL with an "invalid" percent escape. (In this case,
%u0041
).Playground link: https://go.dev/play/p/SqA0_6oLuo3
What did you expect to see?
Printing the URL path as
/%u0041
What did you see instead?
That there was an error parsing the URL.
I'm reopening this bug because the current version of the URL spec for "Percent Encoded Bytes" ( https://url.spec.whatwg.org/#percent-encoded-bytes )
The JavaScript and popular Rust parsers follow this model, which is making interoperability tricky.
JavaScript:
Rust:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f1134169d2851ebe07da4a45e03ff68e