Closed carl-mastrangelo closed 7 years ago
CL https://golang.org/cl/46834 mentions this issue.
To clarify, there are two related issues going on: ParseQuery and Encode don't round-trip the string. I don't think this is technically possible (we don't know if =
should be kept), but it is possible to pick a better lossy translation.
Trying to work around this behavior by explicitly setting the value to nil:
v := make(url.Values)
v["prev"] = nil
doesn't produce any output.
We probably can't change the v["prev"] = nil
behavior to start generating output. That would probably break enough people. A lot of code & tests have accumulated over 5 years depending on every quirk of our implementations.
It would be nice to interpret a url.Values with nil value slice as a flag, and skip the = for encoding.
Using nil or even a sentinel []string value would be unable to represent ?foo&foo&foo
: https://play.golang.org/p/nO_OlSCiCE
You'd almost need a sentinel string value, but strings in Go are compared by value, which means we'd need to make up some magic value like:
package url
// EmptyValue is blah blah no equals sign.
const EmptyValue = "\x00\x00"
But then that precludes any user from actually using that value.
We did something like this for https://golang.org/pkg/net/http/#TrailerPrefix but TrailerPrefix worked because a valid value can't contain a colon, so we have a safe value to use.
With URL parameter values, all values are unacceptable.
Perhaps an empty sentinel array? It would be hacky no doubt. Something like:
var sentinel [0]string
func EmptyValue() []string {
return sentinel[:]
}
Then, in Values.Encode, it could check if the address of slice[0] matches the address of sentinel. It wouldn't be able to catch the multiple 'foo&foo&foo' case, but since this PR treats it as a flag, just setting it once is the important case. Thoughts?
Again, that doesn't help with https://play.golang.org/p/nO_OlSCiCE
Depending on what level of Hacks are passible, I could imagine creating special strings with a prefix, and then slicing them to put in url.Values, When encoding, reflect and unsafe could be used to look for the sentinel prefix to know if its a special empty:
Sorry, we will not stoop to that level.
I don't think that's stooping. If there is a valid technical reason for disallowing it that would be an acceptable answer. The core of the problem is url.URL is publicly defined to be a map. Without defining a new type, or special parameters, the only remaining solution is to modify behavior. That is what the original CL does.
That said, supported foo&foo&foo
is not a primary goal of the CL. Only that specific by more common problem needs a solution.
I don't think that's stooping.
You may not, but I do, and so would most of the Go team. We avoid unsafe except in very low-level places. The net/url
package is not low-level and we don't want to make it be low-level by using unsafe.
That said, supported foo&foo&foo is not a primary goal of the CL. Only that specific by more common problem needs a solution.
It seems weird to support x=y&foo
but not foo&foo
. I'd prefer to be consistent, even if that meant the current foo=&foo=
behavior.
Ok, expanding the scope of solutions. Please rank these in the most acceptable order:
func(p string) bool
, that decides if it should be empty or not.func(ps []string) []bool
, to allow deciding particular empties (like foo&foo=&foo
ForceEmptyParam
which changes how it encodes. There is precedent for adding such booleans to URL.3, 4, and 5 are API creep, 1 is not a real solution, because one of the other will take its place.
Commentary:
?foo
without an =
sign?x/*
. See #17244.=
makes for shorter urls. This is a good thing. I'm actually kind of surprised this isn't possible today or even tested for, given how many other good API decisions were made in net/.You can see why I picked option 6 CL, because it was the best.
Looking at the Go 1 Compatibility promise, this appears to be undefined behavior that I am proposing to change. If some very small number of tests break, can we proceed with the original change?
We'll discuss at a @golang/proposal-review meeting.
Maybe @dsnet could test the change inside Google (a pretty large codebase) and run all those tests and see how much breaks.
Assigning to myself to go test this.
Can you elaborate why this is an important problem to solve? Like Brad said, it's been five years and has not been a problem. Your original report says only "kind of annoying". We tend not to make major possibly-breaking changes with lots of downside if the only upside is to eliminate "kind of annoying".
Two answers to your question: First, it's a good thing, and second it's okay to make this change.
The reason I even raised this was because of issue 3; I couldn't easily work around this. Reimplementing URL encoding is a bad use of my time, error prone, a code smell, breaks compatibility with the standard library (like with http.PostForm), and even goes against core values of Go:
https://golang.org/doc/faq#x_in_std says:
provide key functionality that many Go programs require, such as formatted I/O and networking. It also contains elements important for web programming, including cryptography and support for standards like HTTP, JSON, and XML.
This is absolutely something important for web programming; it affects the visual representation of how my site links together.
Why is this a safe change to make?
We tend not to make major possibly-breaking changes with lots of downside if the only upside is to eliminate "kind of annoying".
So far that's FUD. We don't know who will break, if anybody. That is why I specifically asked:
If some very small number of tests break, can we proceed with the original change?
Which I believe @dsnet said he would try inside of Google after 1.9 is out the door. I would do this myself, but I am unfamiliar with the Go import process internally. Changing the files and testing globally affected very few tests, implying that the source files aren't actually used. That's neither here nor there.
I asked why the problem is important to solve. The only part of your response that could be viewed as addressing that question is the "shorter URLs are good" line. That sounds like a "nice to have" level importance. Let's work with that.
Round-tripping is not a justification. In general there are multiple possible URLs mapping to any particular representation. For example x=1&y=2 and y=2&x=1 map to the same representation, but no one complains that Encode cannot return the latter. Arguing that a particular form should round-trip is really just arguing that "this is the preferred printing form out of all the equivalent ones."
In this case, "x=" and "x" both map to the same representation, url.Values{"x": {""}}, and Encode prefers to print "x=". If the problem is that shorter URLs should be preferred, it sounds like you are arguing that Encode should instead choose "x"; that is, when value is empty in key=value, the = should be omitted as well. But I don't even see that on your list of choices. Why not?
The solution (6) introduces an unexpected special case that doesn't follow from the general one. It's established that url.Values{"x": {"1", "2"}} means x=1&x=2 and url.Values{"x": {"1"}} means x=1. In general, the URL has len(q["x"]) x parameters. The obvious extension is that url.Values{"x": {}} means no x parameters, not one without an equal sign. Even if no existing tests break, it's a special-case that will likely trip people up for years to come. That cost only makes sense to solve a very important problem. I don't see that level of importance here.
In this case, "x=" and "x" both map to the same representation, url.Values{"x": {""}}, and Encode prefers to print "x=". If the problem is that shorter URLs should be preferred, it sounds like you are arguing that Encode should instead choose "x"; that is, when value is empty in key=value, the = should be omitted as well. But I don't even see that on your list of choices. Why not?
It isn't as clear from this issue, because the CL was sent for review before this issue: https://go-review.googlesource.com/c/46834/1/src%252Fnet%252Furl%252Furl_test.go
As mentioned in the comment, I wanted to make the minimal possible change to enable the behavior.
The solution (6) introduces an unexpected special case that doesn't follow from the general one. It's only unexpected if constructing the url.Value from the map functions rather than the methods. Using Get, Set, Parse, Add, etc. can be made to keep existing behavior.
In the CL I sent out, I use len(vs)
to check if the value is empty, but it doesn't have to. It could check vs == nil
to mean that the =
could be removed. This preserves the most original behavior, because ParseQuery always puts in a zero length slice ( https://play.golang.org/p/QyvlwYsVlK ). url.Values produced by the library would appear unchanged, while allowing my own code to set v[key] = nil, to override the production of =
chars.
I do agree it is a special case, but since there isn't any wiggle room left in the API, it seems the most reasonable. That said, I disagree it is unexpected. It would be impossible(?) to encounter using solely the methods in net/url. The only surprising thing would be if someone today had code that looked like:
v, _ := url.ParseValues("q=1")
v["q"] = nil
v.Encode()
and expected the result to be empty.
Even if no existing tests break, it's a special-case that will likely trip people up for years to come.
If it doesn't trip up the entire corpus of Go code written in the past 5 years, then I find this statement to fall flat.
Finally:
Round-tripping is not a justification. In general there are multiple possible URLs mapping to any particular representation.
This isn't a black and white issue. It isn't like all round tripping is possible, or none of it is. What I am asking for in this issue is to add a class of url param strings that previously were not round trippable, which subsequently can be. If I wanted 100% fidelity of query strings, I would have sent Option 5 out for review; I didn't.
It doesn't appear the Go team feels this is an issue. I'm closing this.
Why does this simple issue result in a huge discussion with no result at all?
We link to a 3rd party URL that works only without the =
behind the param key. Obviously, it is very annoying of that server to do that but that's just our loss.
Works: http://ad.zanox.com/ppc/?44455365C78013321
Doesn't work: http://ad.zanox.com/ppc/?44455365C78013321=
We add more parameters to the URL, so it's not as simple as to just slice off the last character. Back to string search+replace then. It's not important.
@nuarhu It's not really useful to continue discussion on an issue closed so long ago. I recommend that you bring this to a forum; see https://golang.org/wiki/Questions. Thanks.
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (
go version
)?1.9
https://play.golang.org/p/FWtvu_9XEM
The
Values.Encode
function adds an unnecessary=
to the end of kv pairs, even if the original value was empty. When using flag style url query values, this is kind of annoying because it is supposed to be either present or absent.It would be nice to interpret a url.Values with nil value slice as a flag, and skip the
=
for encoding.