Closed willnorris closed 1 year ago
Ideally if one wants to put something with path reserved characters into the path, they ought to be escaped before cleaning, I think? That is, we don't really want the components of the embedded URL to behave like part of the enclosing path, but to be a single element, right?
I imagine we could do that escaping proactively at creation time without breaking anything, right?
Maybe we expose a template func that could enforce this behavior when desired? go/ladder => http://ladder/{{ raw .Path }}
or whatever. Just in case we're not convinced this is safe to do as the default behavior. I'm still not 100% sure that would solve the problem if the cleaning is happening at a later step, which I think is what you're alluding to also.
oh, it's not happening later... it's happening earlier. We're using http.DefaultServeMux
in our main
func, and I believe that's doing the URL cleaning before it even gets passed off to our handler.
Well for this example I'm assuming we'd encode the string the user entered as the label, so http://blah/blee
becomes http:%2F%2Fblah%2Fblee
and will subsequently be unaffected by path cleaning. It does mean we'd need to unescape during the resolve step, but I think that would be transparent to the user?
in this case, the URL that's getting munged is not part of the defined go link... it's the extra path that is being added by the user at resolution time. I think r.URL.RawPath
would likely still have the original path, so we could just expose that to templates. Then you could define a golink of go/ladder => http://ladder/{{ .RawPath }}
oh, it's not happening later... it's happening earlier. We're using
http.DefaultServeMux
in ourmain
func, and I believe that's doing the URL cleaning before it even gets passed off to our handler.
Hm. I guess that means the browser isn't encoding it, which I suppose isn't entirely crazy since it could have been a path but for the scheme label. 🤔
I mean, it is just a path, and there's nothing for the browser to encode. http://go/ladder/http://bbc.com/
is a perfectly cromulent URL.. the byte sequence http:
might be part of a scheme label if it were at the beginning of the URL, but here it's just all valid path characters. And http.ServeMux
is perfectly within standard operating procedure to collapse the slashes. We just don't want it to in this case.
The new go1.22 changes for ServeMux have certainly made the handling logic more complicated (in order to support matching on method), but it still always cleans URLs. We either have to use a different mux implementation, or none at all (which wouldn't be awful, I guess... we don't have that many endpoints).
Start with the go link
go/ladder => http://ladder/
. When appending a full URL to the extra path, the//
gets collapsed down to a single/
:This isn't technically wrong, but for a project like ladder this is undesirable.
For imageproxy, I dealt with a similar issue by switching to gorilla's Mux, and disabling path cleaning. I'd rather not do that here if we can keep from it. But if I remember correctly, the default
http.ServeMux
always cleans paths and doesn't provide the ability to disable that. There's work happening onhttp.ServeMux
in go1.22, so I can look and see if any of that would be helpful here.