vuejs / router

🚦 The official router for Vue.js
https://router.vuejs.org/
MIT License
3.96k stars 1.2k forks source link

There should be a simpler alternative to the`(.*)?` pattern #1301

Open zcrittendon opened 2 years ago

zcrittendon commented 2 years ago

Version

4.0.12

Reproduction link

jsfiddle.net

Steps to reproduce

  1. Create route containing multiple (.*)* patterns. For example path: '/:seoPath(.*)*/c/:categoryId/:facetsPath(.*)*'
  2. Trigger processing of route containing the (.*)* pattern with a long path matching the pattern.
  3. Observe a delay of seconds (~ 20 path segments) or minutes (~ 30 path segments)

This delay occurs when rendering router-link as well as during route navigation.

This fiddle has ~20 path segments and shows a 1-second delay when loading the page or clicking the link: https://jsfiddle.net/zcrittendon/j1fmu680/2/

This fiddle has ~30 path segments and shows shows a multi-minute delay (or browser hang) when loading the page: https://jsfiddle.net/zcrittendon/dmzjvq56/8/

This fiddle has has ~30 path segments, but has no delay because it uses pattern (.*)? instead of (.*)* https://jsfiddle.net/zcrittendon/9ze24hx7/5/

What is expected?

Router performance should not be longer than ~1 second even for very long wildcard URL paths.

What is actually happening?

Router performance is extremely long (or hands) for very long wildcard URL paths.

posva commented 2 years ago

If you don't need the array format you should just use (.*)?, otherwise the generated regexp becomes too complex which is what makes it slow.

posva commented 2 years ago

Although I think we could maybe introduce a new syntax: * to just match anything without creating a complex regex group. This needs more thought.

Mash19 commented 2 years ago

I was a little confused by the grouping but looking at the fiddles, I'm seeing that the slashes aren't part of the regex used for the parameters. I guess they are being handled internally, so instead of using the matchAll metacharacter you can use a character class and just include the characters you using in your route. This performs pretty well even with the greedy regex pattern. For the jsfiddle examples replacing the path with path: '/:seoPath([a-z]*)*/c/:categoryId/:facetsPath([-\\w]*)*', There is no hang loading or with the long route and the submatches are in the parameter array. Of course the regex is customized to the routes in the examples, and could be a little broader [a-zA-Z] instead of [a-z] to handle upper and lower case. Or [\w] if you are matching alphanumeric like the facetsPath parameter. There are a few other characters that are valid in an URL path that could be added but you'd probably be better off keeping it simple and only adding what you actually need.

Tofandel commented 2 months ago

I would also add that the pattern should allow for lazy quantifiers, right now any param match will always be greedy and this is very problematic

https://paths.esm.dev/?p=AAMeJSyA3BXABQKmLXANglpgEHQH0R4wAcl3LC4AjAFI8ASkAvA6gKgA4AeDEgDKz3ETgUMBAcAC0BpguIdQAABA&t=/hu/termek-kategoria/page/2/#

The previous path would work if I was allowed to add ?? as a param quantifier to produce the following regex: /^\/hu\/termek-kategoria(?:\/(.*?))??(?:\/(page\/\d+))?\/?$/

And if anything this should be the default, greedy matching makes no sense for a router

https://github.com/vuejs/router/blob/main/packages/router/src/matcher/pathParserRanker.ts#L190

This should basically be replaced from += '?' to += '??' or we should allow the tokenizer to have multiple tokens in the regex part

Tofandel commented 1 month ago

If you don't need the array format you should just use (.*)?, otherwise the generated regexp becomes too complex which is what makes it slow.

I would point out that this opens the router to a bug where navigating will url encode the parameter

Eg: You are on path /some/nested/path/c/123/facet, running router.replace({query: { test: 'test' }) will result in the path becoming /some%2Fnested%2Fpath/c/123/facet?test=test

2291

posva commented 1 month ago

FYI the regexp are already non greedy (.*? and .+?). Regarding the problem the other bug, in your case you can workaround it by explicitely reusing the route

router.replace({
    ...router.currentRoute.value,
    query: { n: Date.now() }
  })
Tofandel commented 1 month ago

FYI the regexp are already non greedy (.*? and .+?)

That's not true because our patterns are wrapped in (?:\/(.*?))? by vue-router

This is the generated regex from the tool

/^\/hu\/termek-kategoria(?:\/(.*?))?(?:\/(page\/\d+))?\/?$/i

As you can see it is greedy: image https://regex101.com/r/psZG8r/1

And because of that page/2 goes into the wrong parameter

This is what it needs to be lazy https://regex101.com/r/YDu8Dx/1 image

But we don't have any way to modify this quantifier because it is added by the core there

Maybe we can tweak it? If the last char of the regex is a ? then we add 2 ?? to the quantifier to respect the users wish for the regex to be lazy

This would result in (.*) still being (?:\/(.*))? but (.*?) becomes (?:\/(.*?))??

Tofandel commented 1 month ago

There also seems to be a problem in the path ranking if I split it into 2 routes with a trailing slash (which is one of our requirement)

https://paths.esm.dev/?p=AAMeJUyB3A7ACAROgDYMTAHuJXIFqHQAUAFACTg_OQA5INikft_qADWg4AUBBfMMSSMghySCcrsAAA..&t=/en/brands/abcd/efh/page/2#

Without a trailing slash the ranking is correct, I think the trailingslash should have a lower rank than a regex https://paths.esm.dev/?p=AAMeJUyB3A7ACAROgDYMTAHuJXIFqHQAUAFACTg_OQA5INikft_qADWgAQVDQMk8w6RBgNwSAIA.&t=/en/brands/abcd/efh/page/2#

Effectively I'm doomed

posva commented 1 month ago

You have multiple ways of handling it. Maybe a query param for the page instead of a param otherwise your routes are indeed redundant and the trailing slash makes it more specific. In your case the upcoming new custom matches could help