golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.29k stars 17.7k forks source link

net/http: enhanced ServeMux routing #61410

Closed jba closed 1 year ago

jba commented 1 year ago

10 October 2023: Updated to clarify escaping: both paths and patterns are unescaped segment by segment, not as a whole. We found during implementation that this gives the behavior we would expect.


7 August 2023: updated with two changes:


We propose to expand the standard HTTP mux's capabilities by adding two features: distinguishing requests based on HTTP method (GET, POST, ...) and supporting wildcards in the matched paths.

See the top post of this discussion for background and motivation.

Proposed Changes

Methods

A pattern can start with an optional method followed by a space, as in GET /codesearch or GET codesearch.google.com/. A pattern with a method is used only to match requests with that method, with one exception: the method GET also matches HEAD. It is possible to have the same path pattern registered with different methods:

GET /foo
POST /foo

Wildcards

A pattern can include wildcard path elements of the form {name} or {name...}. For example, /b/{bucket}/o/{objectname...}. The name must be a valid Go identifier; that is, it must fully match the regular expression [_\pL][_\pL\p{Nd}]*.

These wildcards must be full path elements, meaning they must be preceded by a slash and followed by either a slash or the end of the string. For example, /b_{bucket} is not a valid pattern. Cases like these can be resolved by additional logic in the handler itself. Here, one can write /{bucketlink} and parse the actual bucket name from the value of bucketlink. Alternatively, using other routers will continue to be a good choice.

Normally a wildcard matches only a single path element, ending at the next literal slash (not %2F) in the request URL. If the ... is present, then the wildcard matches the remainder of the URL path, including slashes. (Therefore it is invalid for a ... wildcard to appear anywhere but at the end of a pattern.) Although wildcard matches occur against the escaped path, wildcard values are unescaped. For example, if a wildcard matches a%2Fb, its value is a/b.

There is one last, special wildcard: {$} matches only the end of the URL, allowing writing a pattern that ends in slash but does not match all extensions of that path. For example, the pattern /{$} matches the root page / but (unlike the pattern / today) does not match a request for /anythingelse.

Precedence

There is a single precedence rule: if two patterns overlap (have some requests in common), then the more specific pattern takes precedence. A pattern P1 is more specific than P2 if P1 matches a (strict) subset of P2’s requests; that is, if P2 matches all the requests of P1 and more. If neither is more specific, then the patterns conflict.

There is one exception to this rule, for backwards compatibility: if two patterns would otherwise conflict and one has a host while the other does not, then the pattern with the host takes precedence.

These Venn diagrams illustrate the relationships between two patterns P1 and P2 in terms of the requests they match:

relationships between two patterns

Here are some examples where one pattern is more specific than another:

In contrast to the last example, the patterns /b/{bucket}/{verb}/default and /b/{bucket}/o/{noun} conflict with each other:

Using specificity for matching is easy to describe and preserves the order-independence of the original ServeMux patterns. But it can be hard to see at a glance which of two patterns is the more specific, or why two patterns conflict. For that reason, the panic messages that are generated when conflicting patterns are registered will demonstrate the conflict by providing example paths, as in the previous paragraph.

The reference implementation for this proposal includes a DescribeRelationship method that explains how two patterns are related. That method is not a part of the proposal, but can help in understanding it. You can use it in the playground.

More Examples

This section illustrates the precedence rule for a complete set of routing patterns.

Say the following patterns are registered:

/item/
POST /item/{user}
/item/{user}
/item/{user}/{id}
/item/{$}
POST alt.com/item/{user}

In the examples that follow, the host in the request is example.com and the method is GET unless otherwise specified.

  1. “/item/jba” matches “/item/{user}”. The pattern "/item/" also matches, but "/item/{user}" is more specific.
  2. A POST to “/item/jba” matches “POST /item/{user}” because that pattern is more specific than "/item/{user}" due to its explicit method.
  3. A POST to “/item/jba/17” matches “/item/{user}/{id}”. As in the first case, the only other candidate is the less specific "/item/".
  4. “/item/” matches “/item/{$}” because it is more specific than "/item/".
  5. “/item/jba/17/line2” matches “/item/”. Patterns that end in a slash match entire subtrees, and no other more specific pattern matches.
  6. A POST request with host “alt.com” and path “/item/jba” matches “POST alt.com/item/{user}". That pattern is more specific than “POST /item/{user}” because it has a host.
  7. A GET request with host “alt.com” and path “/item/jba” matches “/item/{user}”. Although the pattern with a host is more specific, in this case it doesn’t match, because it specifies a different method.

API

To support this API, the net/http package adds two new methods to Request:

package http

func (*Request) PathValue(wildcardName string) string
func (*Request) SetPathValue(name, value string)

PathValue returns the part of the path associated with the wildcard in the matching pattern, or the empty string if there was no such wildcard in the matching pattern. (Note that a successful match can also be empty, for a "..." wildcard.)

SetPathValue sets the value of name to value, so that subsequent calls to PathValue(name) will return value.

Response Codes

If no pattern matches a request, ServeMux typically serves a 404 (Not Found). But if there is a pattern that matches with a different method, then it serves a 405 (Method Not Allowed) instead. This is not a breaking change, since patterns with methods did not previously exist.

Backwards Compatibility

As part of this proposal, we would change the way that ServeMux matches paths to use the escaped path (fixing #21955). That means that slashes and braces in an incoming URL would be escaped and so would not affect matching. We will provide the GODEBUG setting httpmuxgo121=1 to enable the old behavior.

More precisely: both patterns and paths are unescaped segment by segment. For example, "/%2F/%61", whether it is a pattern or an incoming path to be matched, is treated as having two segments containing "/" and "a". This is a breaking change for both patterns, which were not unescaped at all, and paths, which were unescaped in their entirety.

Performance

There are two situations where questions of performance arise: matching requests, and detecting conflicts during registration.

The reference implementation for this proposal matches requests about as fast as the current ServeMux on Julien Schmidt’s static benchmark. Faster routers exist, but there doesn't seem to be a good reason to try to match their speed. Evidence that routing time is not important comes from gorilla/mux, which is still quite popular despite being unmaintained until recently, and about 30 times slower than the standard ServeMux.

Using the specificity precedence rule, detecting conflicts when a pattern is registered seems to require checking all previously registered patterns in general. This makes registering a set of patterns quadratic in the worst case. Indexing the patterns as they are registered can significantly speed up the common case. See this comment for details. We would like to collect examples of large pattern sets (in the thousands of patterns) so we can make sure our indexing scheme works well on them.

AndrewHarrisSPU commented 1 year ago

Really enthusiastic about this, seems like an excellent improvement to me!

To support this API, the net/http package adds a new method to Request:

package http

func (*Request) PathValue(wildcardName string) string

Is the status of setting PathValues still as mentioned here: https://github.com/golang/go/discussions/60227#discussioncomment-6132110 - omitted for minimal proposal?

(I convinced myself something like a one-shot Request.ParsePathValues(pattern string), slicing the clean URL path, was the least hazard-free thing. I guess I'd be surprised if Handlers start to depend on PathValue but there was no way to populate independently of ServeMux.)

jba commented 1 year ago

@AndrewHarrisSPU, yes, I deliberately omitted a way to set path values to be minimal. But I do see what you mean about having no way to set it. At the very least, tests should be able to set path values on a request without going through the matching process.

I don't think that exposing the parsing logic should be done as a method on Request; I'm not sure it should be done at all.

I would go with something simpler. Perhaps:

// SetPathValue sets key to value, so that subsequent calls to r.PathValue(key) return value.
func (r *Request) SetPathValue(key, value string)
flibustenet commented 1 year ago

Could you expose the pro and con about adding SetPathValue in the current proposal ?

jba commented 1 year ago

@flibustenet, the pros of being able to set values are that it would be easier to write tests, and it would make it possible for middleware to transmit information via the path values.

That second pro is also a con: once we allow setting, then we have introduced this bag of key-value pairs to http.Request, which anyone can use for whatever they want. Middleware could overwrite path wildcards that a handler depended on. Or the storage could be co-opted for something completely unrelated to the path, much as people abuse contexts (only here it's worse, since setting a path value is a destructive change, so it is visible to everyone who has a pointer to the same request).

benhoyt commented 1 year ago

As someone who's played around with various Go routing techniques, I really like this, and think it'll more or less eliminate the need for third-party routers, except for very specialized cases.

I've tested it using the reference implementation in my go-routing suite, and it was very easy to use. I just copied the chi version and tweaked it a bit. Lack of regex matching like [0-9]+ means you have to error-check any strconv.Atoi calls for integer path values, but that's not a big deal. I much prefer the simplicity of just handling strings rather than fancy regex and type features.

A few comments/questions:

  1. One thing I noticed is that old versions of ServeMux accept patterns like HandleFunc("GET /foo", ...). This would mean code written for the enhanced ServeMux would compile and appear to run fine on older versions of Go, but of course the routes wouldn't match anything. I think that's a bit error-prone, and wonder if there's a workaround. Possibilities are:
    • Instead of overloading the existing methods, use new methods like HandleMatch and HandleMatchFunc. Obviously a fairly big departure from the current proposal, but at least clear what's old vs new.
    • Update old versions of Go (maybe the last couple of versions) to panic on patterns that have a space in them, or perhaps patterns that use any of the enhanced features. This might be too much of a breaking change.
  2. How would you customise the 405 response? For example, for API servers that need to always return JSON responses. The 404 handler is sort of easy to override with a catch-all / pattern. But for 405 it's harder. You'd have to wrap the ResponseWriter in something that recorded the status code and add some middleware that checked that -- quite tricky to get right.
  3. {$} is a bit awkward. Is $ by itself ever used in URLs, and would it be backwards-incompatible just to use it directly?
  4. Regarding SetPathValue: I agree something like this would be good for testing. Perhaps instead of being a method on Request, it could use httptest.NewRequestWithPathValues or similar, so it's clearly a function marked for testing and not general use? I'm not sure how that would be implemented in terms of package-internals though.
francislavoie commented 1 year ago

In Caddy, we have the concept of request matchers. I'm seeing isn't covered by this proposal is how to route requests based on headers. For example, something commonly done is multiplexing websockets with regular HTTP traffic based on the presence of Connection: Upgrade and Upgrade: websocket regardless of the path. What's the recommendation for routing by header?

Another learning from Caddy is we decided to go with "path matching is exact by default" and we use * as a fast suffix match, so /foo/* matches /foo/ and /foo/bar but not /foo, and /foo* matches /foo and /foo/bar and /foobar. Two paths can be defined for a route like path /foo /foo/* to disallow /foobar but still work without the trailing slash. We went with this approach because it gives ultimate flexibility by having routes that match exactly one URL nothing else.

Finally, there's an experimental https://developer.mozilla.org/en-US/docs/Web/API/URL_Pattern_API which has a lot of great ideas for matching URL patterns. I'd suggest considering some of the syntax from there, for example ? making the trailing slash optional.

muirdm commented 1 year ago

Although wildcard matches occur against the escaped path, wildcard values are unescaped. For example, if a wildcard matches a%2Fb, its value is a/b.

Does that apply for "..." wildcards as well? If so, would that mean the handler couldn't differentiate literal slashes from escaped slashes in "..." wildcard values?

jba commented 1 year ago

@muirdm: Yes, in a "..." wildcard you couldn't tell the difference between "a%2Fb" and "a/b". If a handler cared it would have to look at Request.EscapedPath.

jba commented 1 year ago

@francislavoie: Thanks for all that info.

The Caddy routing framework seems very powerful. But I think routing by headers is out of scope for this proposal. Same for many of the features Caddy routing and the URL pattern API you link to. I think we expect that people will just write code for all those cases. By the usual Go standards, this proposal is already a big change.

jba commented 1 year ago

@benhoyt, thanks for playing around with this.

  1. I don't know what to do about older versions of ServeMux.Handle accepting "GET /". We can't change the behavior in older Go versions. If we added new methods we could make Handle panic in the new version, which I guess would solve the problem for code that was written in the new version but ended up being compiled with the old version. But that's probably rare. Maybe a vet check would make sense.

  2. Again, no good answer to how to customize the 405. You could write a method-less pattern for each pattern that had a method, but that's not very satisfactory. But really, this is a general problem with the Handler signature: you can't find the status code without writing a ResponseWriter wrapper. That's probably pretty common—I know I've written one or two—so maybe it's not a big deal to have to use it to customize the 405 response.

  3. $ is not escaped in URLs, so we can't use it as a metacharacter by itself.

  4. I like the idea that only httptest can set the path values. We can always get around the visibility issues with an internal package, so I'm not worried about that.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

flibustenet commented 1 year ago

I see two others usages of SetPathValue

To can start with standard router and switch to an improved one if needed without changing all the methods to retrieve the PathValue in the handlers. @jba you said that it's anyway a big work to switch the routers but I don't think so, I every time begin by standard router and only switch when needed without so much works, and we'll do even more with this proposal.

There is also the case where we want to route from a handler to an other handler or middleware and adjust the PathValues. For example with a complex parsing /api/{complex...} we'll want to expand complex in some PathValues and like to use them with the same method PathValue(). Or we'll need to use request.Context bag and it's worse.

It will prevent the temptation to make mux overcomplicated when we can improve it easily by adding/setting PathValues. We already do this with Form.

treuherz commented 1 year ago

One small addition to this I would really like to see is the option to read back from a request which pattern it was matched with. We find this functionality in other frameworks really helpful for logging and metrics, and for resolving ambiguity in how a request reached a handler. Where it doesn't exist, we add it by using a middleware to add a request tag string to the context, but this causes a stutter wherever it's declared and pollutes logical concerns (routing) with cross-cutting ones (observability) in a way that makes otherwise-simple routing code harder to read and write.

Currently this doesn't exist in net/http, but could sometimes be fudged if a handler separately knew the path prefix it was registered under. With this change the template could be even more different from the URL's Path, so being able to explicitly pull it out becomes more important.

My proposed API would be only adding a Pattern string field to http.Request, which would be set by the router. I realise this brings up similar potential concerns about providing more places for data to be smuggled into requests, but this placement would match most of the information already on a Request, and would be able to be set by other frameworks as well.

Similar functionality exists already in gorilla/mux and chi.

I realise this change is kind of orthogonal to the routing additions, although I think it is made more pressing by them. Happy to open this as a separate issue if that's preferred.

jba commented 1 year ago

@flibustenet, good point about Request.Form already being a bag of values. In fact, there are several others as well: PostForm, MultipartForm, Header and Trailer. Given those, SetPathValue seems relatively benign.

jba commented 1 year ago

@treuherz, I see the usefulness for retrieving the pattern, but it feels out of scope for this issue. I'd suggest a separate proposal.

iamdlfl commented 1 year ago

Regarding response codes, would a 405 response automatically include the "Allow" header with appropriate methods?

prochac commented 1 year ago

What is the state of the request's method?

package http

func (*Request) PathValue(wildcardName string) string

I can see that in reference implementation it's part of the ServeMux

package muxpatterns

func (mux *ServeMux) PathValue(r *http.Request, name string) string

IMO, it should stay this way.

Alternatively, if the method is still intended to be a request method, this proposal should make a commitment to allow setting the path values by 3rd party routers.

jba commented 1 year ago

It's part of ServeMux in the reference implementation because I couldn't see how to modify http.Request without making muxpatterns unusable with net/http.

If it stayed there then every handler would have to have some way to access its ServeMux. But handlers only take ResponseWriters and Requests.

We are leaning to adding a Set method. As was pointed out, it wouldn't be the only bag of values in Request.

jba commented 1 year ago

@iamdlfl, apparently the spec for a 405 says we MUST do it, so we will. I'm just not sure what to put there if there is a matching pattern that has no method. I guess all the valid HTTP methods?

iamdlfl commented 1 year ago

Sounds good. And if someone doesn't define the optional method when registering the route, I don't think we'll have to worry about responding with a 405, right? If I'm understanding right, it will respond normally to any method type.

jba commented 1 year ago

Good point!

jimmyfrasche commented 1 year ago

Some common related features are naming patterns and taking a pattern and its arguments and outputting a url. This typically gets bundled as something you pass to templates so you can do something like {{ url "user_page" user.id }} so the template doesn't need to be updated if the named pattern changes from /u/{id} to /user/{id} or the like. I don't think this should necessarily be included per se but it would be nice if now or in a later proposal there were enough added so that things of this nature could be built. Sorry if this has come up I haven't been able to pay a great deal of attention to this proposal's discussion.

nate-anderson commented 1 year ago

I like the idea of richer routing capabilities in the standard library, but I don't love how overloading Handle and HandleFunc to just accept a new string format makes Go programs using the new syntax backwards-incompatible in a way that can't be detected at compile time.

I like @benhoyt's suggestion of infixing Match into new method names, but I would prefer to also see the method+space pattern prefix omitted for two reasons:

  1. Sometimes a route may wish to accept multiple (but not all) methods - does this string format allow this?
  2. We have the HTTP methods conveniently defined as constants in net/http, but they can't be used here without some ugly and awkward string concatenation.

I would suggest these method signatures:

func (s *ServeMux) HandleMatch(pattern string, handler http.Handler, methods ...string)

func (s *ServeMux) HandleMatchFunc(pattern string, handler http.HandlerFunc, methods ...string)

If no methods are passed in the variadic arg, behave exactly like HandleFunc with the addition of wildcard matching, otherwise return a 405 if the request method doesn't match one in the methods slice. The method(s) being a dedicated but optional argument is self-documenting, IMO, compared to putting all that information into a single string.

This will feel very similar to the .Methods() chaining used commonly in gorilla/mux, which itself could be another approach to consider here.

benhoyt commented 1 year ago

@nate-anderson Having thought about it further, I think I could "go" either way on new method names vs reusing the existing Handle and HandleFunc. The advantage of reusing the existing ones is that, moving forward, there's just one router style to document and explain. If we have two sets of methods, we have to have two sets of documentation, old-style and new.

I also wasn't sure about the method-space-pattern. However, that too has grown on me. For one, I definitely prefer the method first, because that's you almost always see it, I suppose because that's the syntax in the HTTP protocol itself (GET /path HTTP/1.1). So you could say method-space-pattern has a long and rich history. :-)

Your suggestion nicely solves the multiple-method problem, but I find that methods last is a lot harder to read, especially when you have an inline handler function. For example, the method is kind of hidden of at the end when using gorilla/mux:

router.HandleFunc("/users/{user}/settings", func(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "user %s", mux.Vars(r)["user"])
}).Methods("GET")

In my experience, using multiple methods is rare enough to make it not worthwhile to have special handling for. It's easy to duplicate the route for both GET and POST, or add a one-liner getAndPost helper if you're doing it often. Relatedly, maybe we should make the mux respond automatically to HEAD when you register GET? Chi and Gorilla don't do this, but Pat and some others do.

gempir commented 1 year ago

I think automatically registering HEAD in such a low level library is not right. I would expect that from something like nginx, caddy or the likes but not from the standard library of my programming language.

I like Nate's suggestion, only issue is with it is readability is decreased when the method is so far at the end.

Another Suggestion that might be interesting and very close to the initial proposal:

router.HandleMatchFunc(http.MethodGet, "/users/{user}/settings", func(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "user %s", r.PathValue("user"))
})

Making use of the already existing constants or if you like just write "GET" which looks almost the same as the initial proposal.

doggedOwl commented 1 year ago

These functionality changes are very welcome but I still think this is a missed opportunity to rapresent the Protocol as it is. Mixing the method with the URI (in a string rapresentation) in a generic Handle Method is not only not ergonomic, but also error-prone. HTTP does not have a generic handle Method. HTTP defines several well-defined Methods/verbs and the Go implemententation should reflect that with a dedicated method for each one.

I understand tring to minimise the proliferation of methods but not at the expense of clarity, Eitherwise every library can do with a generic Handle/Do/Run method.

@gempir

I think automatically registering HEAD in such a low level library is not right

I think this autoregistration comes naturally from the RFC:

The HEAD method is identical to GET except that the server MUST NOT send content in the response.

This is just one example where a dedicated Method would help. The dedicated method router.Head would make sure that no Content is sent with the response, just the headers. Dedicated Methods would also solve the redability Issue of the variant with the method at the end.

router.Get("/users/{user}/settings", func(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "user %s", r.PathValue("user"))
})
nate-anderson commented 1 year ago

@doggedOwl's suggestion to add router methods for each HTTP method is the most familiar to me having used Chi (and basically every router in the Node.js world) extensively, and arguably also the most concise API being discussed here. Is there genuinely resistance to this approach just because of the number of methods being added? This would leave Handle and HandleFunc method signatures untouched (and those would remain the correct way to handle my niche multiple methods concern above). It also pushes backwards incompatibility to a compile time error instead of runtime, and IMO is the most self-documenting approach, even if it requires a larger API change.

jba commented 1 year ago

@jimmyfrasche, I could see how something like Expand("/user/{id}", map[string]any{"id": name}) would be useful in writing the url template function you allude to. But I don't think it fits in this proposal.

jimmyfrasche commented 1 year ago

@jba that's part of the functionality but requires duplicating the pattern. Generally there's a way to name patterns, get the pattern by name then expand it. So something like

p, _ := mux.GetHandlerPatternNamed("users")
url, _ := p.Expand(map[string]any{"id": name})

That could be handled by third parties that store a map of names to patterns before registering a handler but those wouldn't interoperate well. If named patterns were allowed now even though don't do anything at this point the only things to add later would be a way to get at those named patterns and a way to use them to make the expansion and everything could be built on top of those three pieces. Maybe that whole deal should go into a later proposal but I just want to make sure there's nothing happening here that would preclude doing that in the future.

earthboundkid commented 1 year ago

I think we don’t have to worry about false backward compatibility because r.PathValue will trigger a compiler error in old Go versions.

jba commented 1 year ago

@jimmyfrasche, nothing here precludes that.

jba commented 1 year ago

Here are my thoughts on alternatives to putting the method name in the pattern.

Anything that takes a function followed by variadic args reads badly when a function literal is used, as @benhoyt pointed out above.

Having the method as a first argument, as in

HandleMethod(method, pattern string, h Handler)

has some precedent in NewRequest(method, url, body). For those four new symbols—two ServeMux methods and two top-level functions—what do we get?

What do we lose with these two new methods?

I don't see a compelling argument either way. Let's keep discussing.

Putting the name of the method in the function, like HandleGet, also has precedent in the Get, Head and Post functions. If we define functions for those three methods, we are talking about 12 new symbols: 3 HTTP methods x (2 ServeMux methods + 2 top-level functions). And we'd still need something like the above HandleMethod to support other HTTP methods; that's another 4 symbols. That seems like a lot of new API for minimal benefit.

jba commented 1 year ago

@casimcdaniels:

AndrewHarrisSPU commented 1 year ago

I don't see a compelling argument either way. Let's keep discussing.

A minor point: the surface area for vetting "METHOD path" pattern strings is smaller than fmt formatting.

jba commented 1 year ago

@AndrewHarrisSPU I think that's actually a good point. There are 9 net/http method constants, all simple English words. I'm not sure how much we should worry about misspelling GET or POST or even PATCH or TRACE.

gophun commented 1 year ago

Not sure if this is helpful, but these were some routing libs I've used in the past;

They all look quite baroque to me if I'm being honest.

lucianboboc commented 1 year ago

A dedicated router function per HTTP method looks very good, similar to the above example for other languages:

hherman1 commented 1 year ago

I can’t see much benefit to the added api from adding .get, .post and so on. Doesn’t seem hard, and seems straightforward, to just put GET in the pattern

lucianboboc commented 1 year ago

I can’t see much benefit to the added api from adding .get, .post and so on. Doesn’t seem hard, and seems straightforward, to just put GET in the pattern

Dedicated functions are type safe, the http method in the pattern is more error prone. Also the method and the pattern as 2 different things and are better off to stay separate, together they look horrible 😅

hherman1 commented 1 year ago

I’m not convinced that errors from the stringly typed solution will be a significant problem, and I don’t think it looks horrible

earthboundkid commented 1 year ago

Can go vet check for misuse?

jba commented 1 year ago

@carlmjohnson Yes, if the string is constant.

javea7171 commented 1 year ago

Please don't use strings, it's a half way solution. Look at the most widely used router libraries, this is what users want.

bign8 commented 1 year ago

Given the sentiment of the comments so far, would separating the improvements into distinct discussions help? My understanding of this proposal has 3 improvements (please correct me if I'm missing any):

  1. Path Variables
  2. Method Routing
  3. Domain Routing

Warning, opinions follow

Path Variables is the real win here for me. Parsing the URI manually always felt error prone to me. The proposed PathValue method feels less clunky than doing a strings.Split on the URL.Path.

For Method Routing, I'm not against the proposal. But given the negative sentiment expressed above, I'd consider dropping it in favor of having Path Variables. I can switch on the request's Method property in my handlers easily enough.

Finally, Domain Routing, which I also don't mind the proposal as worded. But if folks don't like Method Routing, I would assume this also rubs folks the wrong way. If needed, request's Host property could be used to determine the proper logic within a handler.

Anyway, I hope the negative sentiment with the last two improvements doesn't "throw the baby out with the bathwater" on this proposal. Just my 2-cents after reading the discussion. Hope it helps. Thanks for organizing the proposal @jba and other contributers. Happy coding!

doggedOwl commented 1 year ago

There is no Negative sentiment about method routing. It's the most useful change here. The discussion is on how it should be expressed not on whether it should be implemented. Practically while we may have routes without variables we (almost) never need a generic route on multiple methods.

earthboundkid commented 1 year ago

@bign8

The current Go router supports domain routing, so there's no way to drop it. :-) I personally think it shouldn't have been added because if you do http.HandleFunc("path", h) instead of http.HandleFunc("/path", h) by mistake, it won't work because it's trying to route the domain path. In practice, I don't see those bugs, I guess because even minimal testing would flush them out.


In terms of the string versus method debate, I agree that probably no one would have chosen this design from a clean sheet of paper, but it's acceptable and I think bugs will be unlikely in practice. I think we should add string based routing now and if there's serious outcry about the ugliness, we can always add method based routing later. Method routing could also be stricter about / without needing the /{$} hack.

deankarn commented 1 year ago

I have seem a lot of good conversation around this, however I strongly believe that this should NOT be added to the STD library and believe people are getting a little tunnel visioned about it.

Issues

This is a fallacious argument; Just because it's popular does not mean speed isn't important and in fact many do not use because of its performance.

There are many other things I don't like about this proposal but these are the most glaring to me. I do not wish us to repeat what I consider mistakes of the past and why I am commenting at all such as:

Proposal

So here is what I would propose if there is an appetite to have/allow routing in the std library, super high level as I do not have time to write and RFC or formal proposal:

Keep routing out of the STD library altogether and allow the community to create their own that suits their needs regexes, pattern matching etc wise but add to the STD library two things:

  1. A way to hook in custom routers/routing into the existing ServeMux and extraction of it for custom logic & helpers through the http.Request.
  2. Add a public interface/generic constraint interface that the community can rally around for these routers to both register one and hook into it from the http.Request.

Doing this would allow all the behaviour discussed here without any breaking changes and more + encourage the community to continue to build and innovate new things that everyone can use & share instead of trying to bring it all in-house and not being able to make everyone happy.

Misc

As a complete side note I believe there are far more important things that the Go team should/could be focusing on to enhance the Go language at a much more fundamental level, and that this should be far down on the list of priorities as it's a non-issue IMO. Going to be linking some things I've attempted to solve that I believe should be in the std library, by no means a self promotion but merely examples, such as:

jba commented 1 year ago

@deankarn Thanks for your thoughtful post.

Pattern matching on the most specific pattern when routes overlap may seem like a good idea, however I'd argue that it simply shouldn't be allowed.

The current ServeMux already does this, so we can't disallow it.

This is a footgun just waiting to go off as someone can go into existing code, add a new route and break existing functionality of another without even knowing about it.

Isn't that true of any scheme that allows overlapping patterns? Are you advocating for a router that disallows overlaps? That's not unreasonable, but I think it's the minority opinion.

Just because it's popular does not mean speed isn't important and in fact many do not use because of its performance.

I meant that routing time is not as important as the fuss over it would lead one to believe. That gorilla/mux is both slow and popular is evidence for that claim. Also, no one has yet shown me examples where speed in excess of http.ServeMux is needed. Please point me to some.

A way to hook in custom routers/routing into the existing ServeMux and extraction of it for custom logic & helpers through the http.Request.

There isn't much to ServeMux besides routing. What would those hooks look like? What existing ServeMux behaviors would be unchanged? How would the situation be better than it is now, where people write alternative routers that also implement http.Handler?

Extraction of arbitrary routing data from http.Request would look something like map[string]any. That was part of the original proposal, but was rejected as being too clumsy.

I believe there are far more important things that the Go team should/could be focusing on to enhance the Go language

Please file proposals for the items on your list that aren't already proposals, and advocate for them in the appropriate places. Your list is off-topic here and, more importantly for your goals, will not get a large audience.

deankarn commented 1 year ago

The current ServeMux already does this, so we can't disallow it.

You are proposing adding new breaking functionality and was suggested hiding it behind an opt-out flag giving the ability to preserve the original behaviour of ServeMux which allowed overlapping pattern matching. So this logic is new and is not beholden to the old way ServeMux allowed it and therefore is not a limiting factor in any way to the new additions being proposed.

Isn't that true of any scheme that allows overlapping patterns? Are you advocating for a router that disallows overlaps?

No not any schema allows overlapping patterns in the dynamic pattern matching portions. Yes I would strongly advocate to disallow overlapping dynamic & non dynamic patterns portions.

That's not unreasonable, but I think it's the minority opinion.

Why is this true? Going to flip this question around. Why would anyone want to allow overlaps of dynamically matched patterns and non dynamic portions as it's dangerous? Here's a super contrived example only to demonstrate why it's a dangerous footgun

/restaurant/{id}/{menu}
/restaurant/{id}/holidays // added at a later data than first route and not side-by-side in code

In this case nobody would be able to navigate to the holidays menu for the restaurant. And worse there are no errors warning or bells so site remains behaviourally broken until someone notices, if ever. Again this is a little silly example but could happen to much more important pieces of code.

I meant that routing time is not as important as the fuss over it would lead one to believe. That gorilla/mux is both slow and popular is evidence for that claim. Also, no one has yet shown me examples where speed in excess of http.ServeMux is needed. Please point me to some.

Again popularity is not evidence in any way to support this claim. Good marketing and first to the table scemantics are big contributing factors to what looks like popularity, yet have no bearing on performance as evidence of a thing.

I can't speak for all but I live in the BigData world and when dealing/processing that kind of volume of data speed is of the utmost importance and often translates directly into 💰money lost or saved. Again I'm going to flip this around as it's your proposal, prove to me that speed isn't important; because it is to me and I'm sure a lot of other people dealing with data at scale.

There isn't much to ServeMux besides routing. What would those hooks look like? What existing ServeMux behaviors would be unchanged? How would the situation be better than it is now, where people write alternative routers that also implement http.Handler?

Hooks could look like so by defining a common interface for the community to rally around, naming verbosely for clarity only not proposing these names necessarily:

    type Router interface {
        // Serve returns an http.Handler to serve all incoming requests. This will also set path params on the Requests context
        // however the router wants for later extraction.
        Serve() http.Handler

        // PathParams is a simple function that allows extraction of path params set in the `Server()` method, if any.
        PathParams(ctx context.Context) map[string]string
    }

    http.ServeMux.RegisterCustomRouting(router Router)

    http.Request.PathParams() map[string]string

    type Request struct {
        ...
        paramsFn PathParams(context.Context) map[string]string // From the registered router
        ...
    }

    func (r *Request) PathParams() map[string]string {
        return r.paramsFn(r.Context())
    }

How would it be better than it is now? It would allow the community to pivot around the defined above Router interface allowing it to be used with the STD lib but allowing the community to swap in whatever routing that specifically suits their specific use case. An example is already present you don't think overlapping pattern is a big deal, I do, and we're unlikely to ever see eye-to eye on that point and that's the beauty of doing it this way; we don't have to. People will be able to have their cake and eat it too.

Also because ServeMux allows registering of these routers already today through an http.Handler function I'm still not convinced anything like this is needed as this proposal is trying to solve a non-issue. But if something has to be added to the STD library let it be this instead of trying to make something that is not possible to cover all use cases.

Extraction of arbitrary routing data from http.Request would look something like map[string]any. That was part of the original proposal, but was rejected as being too clumsy.

map[string]any is a bad idea because of the any, it's completely unnecessary. Path params are strings until some handler logic parses/uses them otherwise. The rest of the http library works this way also, just look at url.Values which is really a map[string][]string; there is no reason to parse specific data types when doing pattern matching of them and trying to do so is conflating two very separate things, routing and parsing/processing of the request.

There will 100% be a need for the community to dynamically parse path params, rejected in the original proposal or not, this would seem a fundamental piece of the puzzle that's trying to be swept under the rug but in fact has great bearing on the direction discussions may take.

Please file proposals for the items on your list that aren't already proposals, and advocate for them in the appropriate places. Your list is off-topic here and, more importantly for your goals, will not get a large audience.

I have. Maybe the list is a little off topic, but served to demonstrate that there are other things where the issue to be solved is much clearer. I don't understand what this proposal is trying to solve. What is the at-large problem this is intended to solve that doesn't already exist within the community? Everyone seems to be talking about the how and not asking about the why?

AndrewHarrisSPU commented 1 year ago

I don't understand what this proposal is trying to solve. What is the at-large problem this is intended to solve that doesn't already exist within the community?

Independently of ServeMux, with respect to PathValues, I'm convinced looking at the spectrum of community solutions supports a conclusion that things could be improved by extending the http.Request. Some community solutions are doping context.Context with path values, some are giving their own alternatives to standard library requests and handlers (confusingly, some named Context). Some are doing both, some are doing neither in non-performant ways, etc.

I'm reading the intent of the proposal as keeping ServeMux as a statically loaded component, not something that has to be everything for everyone - it's just that SeverMux will depend on path variables ... I think there's agreement that path variables should be more universal?