kwhitley / itty-router

A little router.
MIT License

Get wildcard value #22

Closed vivaladan closed 3 years ago

vivaladan commented 3 years ago

Firstly, lovely little router this, but there's one thing I'm not clear on. Is there a way, in the handler, to get the value that a wildcard matched on? For example,

const router = Router()
router.get('/maps/tiles/vector/*', vectorTilesHandler)
router.get('/maps/tiles/static/*', staticTilesHandler)
router.get('/maps/packs/vector/*', vectorPacksHandler)
router.get('/maps/packs/static/*', staticPacksHandler)

I'd like to get the whole wildcard value in the handler. I can get the full url, but then that means I need to remove the host and duplicate the initial piece of path so I can strip that off the string.

Likewise I can't use parameters because you can't have a greedy parameter that takes all segments. The reason I need the wildcard is so I can proxy it as is (in these cases).

kwhitley commented 3 years ago

Hey @vivaladan - definitely not one built-in (basically the defined params turn into capture group names, whereas a wildcard has no in-built name). Let me do some playing here in a bit and get back to you...

Surely something like /api/:collection/actions/* would still capture the request.params.collection before wildcarding, no?

Still doesn't solve your use case though. Without adding too much bespoke logic for an edge case, what if the route itself were accessible to the handler (via the request)? Would that help?

vivaladan commented 3 years ago

If the route was available, then that would take me a step closer. I could take everything before the * and then trim that from the start of the actual path. Although I'd have to consider the best way to remove the host too.

There is no urgency for this, it's just an idea that may help others. For the time being I'm going to duplicate the route in the handler and use that to get everything that comes after.
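For anyone landing here with the same need, the interim workaround described above could be sketched roughly like this. The names wildcardRemainder, withWildcard and request.wildcard are made up for illustration and are not part of itty-router; the only assumption about the router is that middleware receives the request and may attach properties to it, as in the examples elsewhere in this thread.

```javascript
// Hypothetical helper (not part of itty-router): returns whatever follows the
// fixed prefix of a wildcard route such as '/maps/tiles/vector/*'.
const wildcardRemainder = (route, url) => {
  const star = route.indexOf('*')
  const prefix = star < 0 ? route : route.slice(0, star) // '/maps/tiles/vector/'
  const { pathname, search } = new URL(url)              // parsing drops the host
  return pathname.startsWith(prefix)
    ? pathname.slice(prefix.length) + search             // keep the query string
    : null
}

// Hypothetical middleware factory: attaches the remainder to the request
// and returns nothing, so the next handler still runs.
const withWildcard = route => request => {
  request.wildcard = wildcardRemainder(route, request.url)
}

// Intended usage (assumes an itty-router instance named router):
//   const route = '/maps/tiles/vector/*'
//   router.get(route, withWildcard(route), vectorTilesHandler)
```

Declaring the route string once and passing it to both the router and the middleware avoids duplicating the prefix inside each handler.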

vivaladan commented 3 years ago

To give a little context of the real world example here. I'm replacing an AWS API Gateway with a worker. On some paths, I will use their version of mapping segments against variables that feed into a Lambda to deal with. e.g. /:x/:y/:z.pbf. Your router already deals with this beautifully.

In other, simpler paths that I can pass through after a certain point (because I don't need to know the rest, or it'll be variable), I'll use their {proxy+} route variable, which is a greedy wildcard. I can then slap {proxy} on the end of the origin to carry over just what the wildcard matched on.


kwhitley commented 3 years ago

So... if I follow you (and if I don't, please correct me!)... this just sounds like a nested router (which hands off downstream requests to a sub router). The catch being that if you needed something from upstream, you'd need to inject it (via middleware) before the downstream route sees it.

// routers
const parentRouter = Router()
const childRouter = Router({ base: `this is what we need to figure out for your dynamic parent route` })

// child route
childRouter.get('/:action', request => 
  new Response(`Action ${request.params.action} firing on ${request.collection}`)
)

// middleware
const withCollection = request => {
  request.collection = request.params.collection // inject whatever you like into the request
}

// attach middleware to capture this route
parentRouter.get('/api/:collection/*', withCollection, childRouter.handle)

kwhitley commented 3 years ago

I'm gonna see if you can do a dynamic base path (e.g. /api/.*) or something... may have to escape it (/api/\.\*) since it's regex that creates regex (that's later matched to paths)... testing now

kwhitley commented 3 years ago

Another thing itty does (currently) is each new handler will override the route params if they were already set... I did this to a) save characters, and b) because I didn't imagine slowly building routes+params along several route matches or nested routers.

That may be worth the (non-trivial) character addition to support... testing now:

it('can allow multiple param injections', async () => {
  const router1 = Router()
  const router2 = Router()
  const handler = jest.fn(req => req.params)
  const middleware = () => {}

  router1.get('/:collection/*', middleware, router2.handle)
  router2.get('*', handler)

  await router1.handle(buildRequest({ path: '/items' }))
  expect(handler).toHaveBeenCalled()
  expect(handler).toHaveReturnedWith({ collection: 'items' })
})

This is showing an unrelated handler (matching on the downstream wildcard) still being able to access the req.params injected by the upstream route...

vivaladan commented 3 years ago

It's not the base path or parent that's dynamic, it's the end, that the * will match. So that's why I wanted to get that value, so I could chuck it on the end of another URL without having to parse it.

For example, a request comes in under https://something.com/widget/v2/ and I'd make a fetch request to https://widget-api.com/api/..., where ... is whatever string the wildcard matched on.

While it doesn't solve my problem, the nested router feature is really cool though.

kwhitley commented 3 years ago

Gotcha! :thinking hat:

kwhitley commented 3 years ago

Ok, so I've verified I can capture that but... as this is an edge case, mind pointing me to other libs (any language) that capture the wildcard as a param? I want to see what they call it so I have an idea of what to name the capture group...

vivaladan commented 3 years ago

I'm not too sure about routing libraries. AWS call it a 'greedy path variable'. It just acts like a wildcard on the end of a route, but you get its value as a variable called proxy.

R167 commented 3 years ago

@kwhitley Here's an example of how Express (JS) and Sinatra (Ruby) handle unnamed wildcards:

According to the docs, Express (v4) exposes these as index-based entries on params. For example, for the path /file/*, Express exposes the splat parameter at req.params[0] (Express req.params):

When you use a regular expression for the route definition, capture groups are provided in the array using req.params[n], where n is the nth capture group. This rule is applied to unnamed wild card matches with string routes such as /file/*:

// GET /file/javascripts/jquery.js
console.dir(req.params[0])
// => 'javascripts/jquery.js'

Alternatively, the Ruby library Mustermann, used by Sinatra, uses the default name "splat", so you can get the wildcard capture group at params["splat"].

pattern = Mustermann.new('/:prefix/*.*')
pattern.params('/a/b.c') # => { "prefix" => "a", "splat" => ["b", "c"] }

Hope that's useful!

kwhitley commented 3 years ago

Couple problems I ran into that may [sadly] be the thread-killer. It was one thing to capture the wildcard as another capture group and name it something. This differs a little from traditional "splat" handling, that really seemed to emulate old school printf statements by simply being an array of wildcard occurrences. As a capture group (just like a route param), we would capture one wildcard, and only one... as whatever param we named it internally (e.g. "splat" or "wildcard").

The problem with this approach is that currently multiple wildcards are perfectly supported, however ill-advised. If we were to do this, regex would throw anytime a second+ wildcard was included as it would collide with the named group. Without adding a ton of code for this edge case and including traditional splat support instead, rolling out this change would require a major version bump and still a fair bit of code addition for an edge case.
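To illustrate the collision, here is a minimal sketch. It assumes a simplified route-to-regex translation (the hypothetical toPattern below) that gives every wildcard the same group name, "wildcard"; it is not itty-router's actual implementation.

```javascript
// Hypothetical translation: every '*' becomes a named group called "wildcard".
const toPattern = route => `^${route.replace(/\*/g, '(?<wildcard>.*)')}$`

// Compile a route, returning the error instead of throwing so both
// outcomes can be inspected side by side.
const compile = route => {
  try {
    return new RegExp(toPattern(route))
  } catch (err) {
    return err
  }
}

// A single wildcard compiles and captures as expected.
const one = compile('/files/*')
const captured = one.exec('/files/a/b.js').groups.wildcard

// A second wildcard reuses the group name within the same pattern,
// which the RegExp constructor rejects with a SyntaxError.
const two = compile('/files/*/versions/*')
const collided = two instanceof SyntaxError
```

Numbering the groups (wildcard1, wildcard2, ...) or collecting them into an array, as traditional splat handling does, would sidestep the collision, at the cost of the extra code mentioned above.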

Ultimately we decided that to stay true to the origins - an absurdly tiny router that handled "most" cases really well, we'd have to leave this solution in "workaround land", and drop the PR. 😞

That said, I do really appreciate the convo, the patience in explaining the use-case, clarifications, etc!

Also, and maybe this is a possible path forward... I've just published the rough draft of itty-router-extras, that includes a wrapped version of itty (to automatically handle thrown exceptions). Doing something like this allows you to intercept both the handle method of the router, as well as any route-registering, to perhaps add this logic just upstream of itty. Would be an interesting experiment to see if we could get that to work...

Example code for this router (with library-specific error removed for clarity):

const { Router } = require('itty-router')

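// Wrap the router in a Proxy: every method call passes straight through,
// except handle(), whose rejected promise becomes a 500 response.
const ThrowableRouter = (options = {}) =>
  new Proxy(Router(options), {
    get: (obj, prop) => (...args) =>
      prop === 'handle'
        ? obj[prop](...args).catch(err => new Response(err.message, { status: 500 }))
        : obj[prop](...args)
  })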

The advantage of this approach is that it becomes a perfect drop-in for itty, anywhere in your code, without wrapper functions, different signatures, etc.

vivaladan commented 3 years ago

I definitely feel that a splat is far more useful and widely used than multiple wildcards would be. Given the choice between the two features, I would certainly pick the splat. With that said, all I have is anecdotal data and perhaps a bias because it would solve some problems I'm having. I love the project though, so I'm happy to take your steer on what is best for it. Thank you for looking into this all the same. The itty-router-extras sounds pretty cool too, and I like how you're not compromising the original project's goals in order to flesh it out. I'll be sure to take a look myself.

kwhitley commented 3 years ago

Thanks for understanding! When I get some free time this week, I'll try to take a stab at a splat-enabled wrapper! :)