Closed karthik2804 closed 6 months ago
I'm not necessarily opposed to this but just want to note that V2 manifests help with a workaround in some (not all!) situations:
[[trigger.http]]
route = "/..."
component = "bartholomew"
[[trigger.http]]
route = "/spin/..."
component = "proxy"
[[trigger.http]]
route = "/spin/v1/..."
component = "bartholomew"
@lann that works in that specific case but if someone adds a /spin/v2
as well, then we would need to add another definition in the manifest. So I would imagine it would not scale very well.
@lann one other practical issue is that it increases the memory usage since the extra component mounts end up creating their own copy of the static files
With v2 manifests that shouldn't happen if you point two triggers at one component. If mounts are duplicated in that case it's a bug.
Oh ok - I was testing with just v1
and spin v1.5.1
. I was creating multiple components, so that makes sense.
@karthik2804 did Spin v2 and the v2 manifest help here? I'm assuming we'd still like this issue to track nested wildcard route support (eg route = "/spin/*/test"
mentioned above)?
Do we have a good sense of what we want from this feature? It seems like the ask is specifically for a kind of "single level" wildcard, i.e. like /...
except only up to the next /
(if there is one). Is that the full extent of it?
Some challenges I see here are:
/...
tails via spin-path-info
?/a/b/*/d
and /a/*/c/d
then which one should handle /a/b/c/d
?Any other design considerations?
The use case I have had requires only a single level of wildcard but based on @lann's workaround above, it could be done by adding multiple triggers.
spin-path-info
to be exposed as it is today. I am not sure I follow the question about "matched" segments. @karthik2804 Sorry I maybe confused the question by giving the spin-path-info
analogy - it was intended only as analogy. Let me rephrase:
Suppose I have a trigger /a/*/b
. Suppose a request matches that trigger, and my component runs. My component wants to know what the *
was. (For example, if the user visited /a/gorilla/b
, I want the gorilla
part.) How does it get it?
(Potentially nastier is the multiple case of /user/*/note/*
. Now from /user/itowlson/note/4067
my component wants to know that the wildcards were itowlson
and 4067
.)
@karthik2804 Is your comment about the multiple trigger workaround suggesting that this is no longer important (at least to you) and could be dropped from the 2.1 list? I am not sure if we have any other users asking after this yet.
@itowlson, it was in reference to the following
It seems like the ask is specifically for a kind of "single level" wildcard, i.e. like /... except only up to the next / (if there is one). Is that the full extent of it?
The workaround above solves the issue I ran into (I have yet to test).
On the sharing multiple matched segments, could add a new header URL paths can contains commas. spin-matched-segments: "itowlson,4067"
where the segments are separated by commas.
I chatted with Karthik offline and some alternative ideas for capturing *
values could be:
/user/*username/note/*noteid
giving rise to separate spin-something-username
and spin-something-noteid
headers.
- Allow/require names for wildcards e.g.
/user/*username/note/*noteid
giving rise to separatespin-something-username
andspin-something-noteid
headers.
FYI: From a customer perspective, this would be a really useful feature to have - personally, I would vote for named wildcards, but I'd be happy with any workable solution for enabling more powerful wildcard paths 😁
Regular Expressions are The Swiss Army Knife of Text Processing.
Flexible routing structure can give me more flexible / micro-sized modularization which I can split my app into more smaller wasms. Isn't that the recommended direction for wasm architecturing?
Given that https://github.com/fermyon/spin/pull/2305 is nearing completion, one thing I'd like to explore is using that service chaining feature to enable advanced routing scenarios via a routing component, especially for something like regular expressions which may not be appropriate for all users.
I started exploring this and I am getting very hung up on potential ambiguities. For example, if the app has routes /a/*/c
and /a/b/...
, which one does /a/b/c
match? I started going with a "longest initial exact segment takes precedence" rule, which runs into tiebreaking problems, but those are fixable. But then I started thinking "exact trailing segment should trump wildcard trailing segment".
And then I started thinking "what if we just reject applications with this kind of ambiguous overlap", requiring any pair of routes to be either disjoint or in a subset relationship. Because if you have /a/*/c
and /a/b/...
then you probably have problems that you need to address before the ambiguous request comes in. And this seems like a reaonable rule if it's easy to implement, but I bet it will turn out to be equivalent to the halting problem because everything always does. And it leaves people who really want to express a / anything-but-b / c
stuffed unless we extend the pattern language further (hello regular expressions, hello unbounded evaluation times).
Another possibility, which lines up with some programmatic routers, is to make it order-dependent: first (or last) match wins. Pre-1.0 Spin had that rule, and we replaced it with longest match because it provides more reliability as manifests change (e.g. triggers getting tacked on the end via spin add
). We could bring that back for the specific case of "best match is not well defined" but I'm really reluctant to do that because who knows what well-defined means in this case.
Finally, the original issue asked for single-segment match only in a trailing position, e.g. a *
suffix to go alongside the ...
one. That avoids any ambiguity, but falls short of what others in the thread have been talking about.
I feel like I'm overthinking this because lots of routers just do this stuff, so, save me from myself folks: how should this behave?
Could we lean into routefinder here? If we introduce named parameters (e.g. :planet
) it seems like we could establish a mapping to the syntax supported by routefinder (e.g. /...
=> /*
). This would appear to cover the usecases above.
Also, I'm not a big fan of supporting ...
and *
; historically I've seen them both used to mean "wildcard" and dread the confusion to inevitably plague our route dx.
The question then is how do we surface named captures to a guest? Immediately I'm not opposed to propagating captures as spin-route-param
headers (or something of the sort) as mentioned earlier.
The major benefit of creating an isomorphism to routerfinder (or similar syntax) is that we can take advantage of the ranking and ordering of routes to handle all the ambiguities mentioned.
I like routefinder
's precedence rules and am similarly wary of using *
since its meaning varies so much (any single segment vs any number of segments vs any number of trailing segments).
I'd be fine with e.g. /:param/...
(disallowing *
as an input) as the next step here. Given that we already have a bunch of magic spin-*
headers I suppose something like spin-param-<param>
is probably fine.
Yeah sorry *
was not a syntax proposal it was a way of expressing the ambiguity. The ambiguity, not the syntax, is the problem.
I will look at routefinder precedence: thanks.
Currently, the route matching in spin can be of two forms
It would be helpful in certain cases if the route matching allowed more granularity. For an example consider the case of a Bartholomew app with a proxy component.
With such a configuration, the
proxy
component would match the route/spin/index
but not/spin/v1/index
. Thus allowing theproxy
to handle one specific subset of the routes whilebartholomew
handles the rest.There is potential for another level of granularity where we could define something like
Which would only match
/spin/index/test
but not/spin/index/something-else
Thoughts?