Open Snaipe opened 2 years ago
I'll take look into it. Currently router path matching relies heavily on slashes when it traverses routes tree looking for matching node. maybe there is something we can do but echo.Router.Find()
is most complex and performance sensitive part of whole repository so we need to be careful here.
I've had to implement a small service implementing parts of the docker registry API; the issue is that multiple endpoints look like
/v2/<repo>/manifests/<reference>
, where<repo>
may contain unescaped slashes./v2/library/ubuntu/manifests/latest
is thus a valid endpoint to get the manifests of the latest tag for thelibrary/ubuntu
repository.
This looks like is a very specific use case that requires "file system" parts in the middle of the URI. As a slash is a very important part in an URI and the whole router is trimmed for that for higher performance. So this is problematic.
The problem is that naively,
Echo.GET("/v2/:repo/manifests/:reference")
doesn't work (which I expected), andEcho.GET("/v2/*/manifests/:reference")
does not work either, becausereference
does not get recognized. Worse, if you go with the second form, and register another endpoint like"/v2/*/blobs/:digest"
, the second one starts matching paths like"/v2/repo/manifests/foobar"
.
A path like "/v2/*/blobs/:digest"
should be an error for the current router actually, as it will never match the parts after the asterisk.
It'd be great if this worked seamlessly. It doesn't seem crazy to have a greedy matching syntax for parameters that may contain slashes, like
/v2/:*repo/manifests/:reference
-- regular expressions are able to do just that, in fact.
For the current router I do not see too much room for implementing that. This might change with the next major release v5 though.
I've worked around this issue with a .Pre handler that rewrites the URL with path-escaped components based on a set of route-specific regular expressions:
From the rest perspective this is a rather akward resource pattern to route. It seems to simply match static files in a filesystem.
For your use case using a path like "/v2/blobs/:digest/*"
would be a better match and working out of the box.
But as your requirement is different and probably cannot be changed simply using I only see two options:
/v2/myrepo/funny/stuff/manifest/blabla
=> /v2/manifest/blabla/myrepo/funny/stuff
and use /v2/manifest/:repo/*
in the routerc.Param("*")
yourself, this would result in a route of /v2/*
and you will get a path like /myrepo/funny/stuff/manifest/blabla
for further processing in the handler. No requirement for rewriting the URLTo summarize, I'm not sure this needs to be solved in the router. It would add quite some complexity there and probably reduce performance, whereas a workaround using path rewriting is quite simple to implement and trim to your use case. The ability to reverse a route might be affected too.
Guess I need to get a picture on other benefits a special operator like :*
to allow matching slashes would bring.
v5
would help only that much that you can replace router implementation as it has interface now.
I have trouble imaging at the moment rules for this case (priority and backtracking and their relation to slash being one of the router node separator). "greedy" path param (:<name>
or any (*
) param has lower priority than their regular counterpart.
These new "operators" for paths are slippery slope. You could up with your own regexp syntax eventually.
Probably it would best to say that core/default functionality ends at some point and features beyond that should be supported in sense that you should be able to replace core components with extended/different versions of them.
This is going to be unpopular opinion but we should keep core of Echo simple (at least to certain degree). There are alternatives ala Fiber that strives to please everyone as a newcomer. Or Gin which has at least 2x the Echo userbase and larger API with methods for specific features - which all you can do in Echo but do not have dedicated methods.
I think I maybe should investigate how it would be to wrap existing (more flexile) Go http routers to v5
interface and provide some examples.
I agree that having a separate v5 Router implementation seems best. That being said, non-backtracing expression matching is a fairly well-understood problem in the world of regular expressions, that's what nondeterministic finite automata are for.
I got curious and I whipped up a terrible, unoptimized router using a finite automaton. Here's what I get running the benchmark suite in router_test.go:
goos: linux
goarch: amd64
pkg: github.com/labstack/echo/v4/routerfsa
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkRouterStaticRoutes-8 48820 26048 ns/op 7522 B/op 627 allocs/op
BenchmarkRouterStaticRoutesMisses-8 260211 4574 ns/op 768 B/op 64 allocs/op
BenchmarkRouterGitHubAPI-8 19454 60121 ns/op 11331 B/op 944 allocs/op
BenchmarkRouterGitHubAPIMisses-8 275337 4264 ns/op 768 B/op 64 allocs/op
BenchmarkRouterParseAPI-8 167941 6747 ns/op 1248 B/op 104 allocs/op
BenchmarkRouterParseAPIMisses-8 282285 4006 ns/op 768 B/op 64 allocs/op
BenchmarkRouterGooglePlusAPI-8 390236 3001 ns/op 624 B/op 52 allocs/op
BenchmarkRouterGooglePlusAPIMisses-8 273093 4237 ns/op 768 B/op 64 allocs/op
BenchmarkRouterParamsAndAnyAPI-8 166310 7903 ns/op 1152 B/op 96 allocs/op
Compared to the vanilla router:
goos: linux
goarch: amd64
pkg: github.com/labstack/echo/v4
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkRouterStaticRoutes-8 70563 16831 ns/op 0 B/op 0 allocs/op
BenchmarkRouterStaticRoutesMisses-8 2103000 535.5 ns/op 0 B/op 0 allocs/op
BenchmarkRouterGitHubAPI-8 41346 28990 ns/op 0 B/op 0 allocs/op
BenchmarkRouterGitHubAPIMisses-8 1664457 734.7 ns/op 0 B/op 0 allocs/op
BenchmarkRouterParseAPI-8 717120 1544 ns/op 0 B/op 0 allocs/op
BenchmarkRouterParseAPIMisses-8 3287810 378.0 ns/op 0 B/op 0 allocs/op
BenchmarkRouterGooglePlusAPI-8 1000000 1041 ns/op 0 B/op 0 allocs/op
BenchmarkRouterGooglePlusAPIMisses-8 2035212 583.3 ns/op 0 B/op 0 allocs/op
BenchmarkRouterParamsAndAnyAPI-8 505057 2624 ns/op 0 B/op 0 allocs/op
So it's about 3x-10x worse than the normal router, and of course adds a ton of allocations due to my lazy implementation, but it looks like this has a chance to have similar performances as the vanilla router if it gets properly implemented and optimized. I don't have the time to investigate this further right now, but if this seems sound, I can look at providing a real implementation using the v5 router interface if nobody beats me to it.
This looks cool. But I imagine it gets way harder when addition to Github API route tests all those other edge case tests need to pass. I do not think Github routes even touch any of backtracking requirements. Zero allocations is also quite good brainteaser.
For history sake I'll link last backtracking related threads here #1770 and #1754 as those discussions illustrate some of the hidden complexities.
offtopic: one "wild" idea that I had is that we could actually inspect routes at the start of the server and depending of "characteristics" different/specific router implementation could be chosen. For example if you have only static routes you probably do not need backtracking etc at all. But I admit - this is over-engineering. Ability to compose Echo with Router of your own choosing opens most doors and at the same time does not introduce much complexity.
Issue Description
I've had to implement a small service implementing parts of the docker registry API; the issue is that multiple endpoints look like
/v2/<repo>/manifests/<reference>
, where<repo>
may contain unescaped slashes./v2/library/ubuntu/manifests/latest
is thus a valid endpoint to get the manifests of the latest tag for thelibrary/ubuntu
repository.The problem is that naively,
Echo.GET("/v2/:repo/manifests/:reference")
doesn't work (which I expected), andEcho.GET("/v2/*/manifests/:reference")
does not work either, becausereference
does not get recognized. Worse, if you go with the second form, and register another endpoint like"/v2/*/blobs/:digest"
, the second one starts matching paths like"/v2/repo/manifests/foobar"
.It'd be great if this worked seamlessly. It doesn't seem crazy to have a greedy matching syntax for parameters that may contain slashes, like
/v2/:*repo/manifests/:reference
-- regular expressions are able to do just that, in fact.I've worked around this issue with a .Pre handler that rewrites the URL with path-escaped components based on a set of route-specific regular expressions:
Still, this is fairly half-baked, since the parameter values then need to be path-unescaped in the handler. It'd be nice if this was just plainly supported.
Checklist
Expected behaviour
Have a way to register routes with parameters containing unescaped slashes
Actual behaviour
Feature is missing
Steps to reproduce
(none, this is a feature request)
Working code to debug
(none, this is a feature request)
Version/commit
v4.6.1 (via go get)