BobbieGoede opened 4 months ago
Thanks for the investigation.
It would be interesting to see an actual perf benchmark (without Nuxt). The difference in creation shouldn't be that big. That said, Vue Router 3 had no path sorting like v4 does. A benchmark would also allow generating a flame graph, but I think your guess about `insertMatcher` is right.
Initially, I designed the router to accept a custom matcher implementation (e.g. a trie) that would be more performant but with fewer features than v3. I never got the time to actually work on it, though.
I also thought (but never implemented it) of a way to pre-compile the routes so the matcher can be created instantly. I think this could be an interesting feature, especially for Nuxt where this can be automated. Combined with the ability to pass a custom route matcher, this should work out pretty well!
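As a rough sketch of what such a pluggable, pre-compiled matcher could look like (all names here are made up for illustration; this is not the actual vue-router API):

```typescript
// Hypothetical sketch of a pluggable matcher. The idea: the router accepts a
// matcher implementation, so a pre-compiled or trie-based one can be swapped
// in. None of these names exist in vue-router; they only illustrate the shape.
interface MatchResult {
  path: string
  params: Record<string, string>
}

interface RouteMatcher {
  resolve(path: string): MatchResult | null
}

// A trivial exact-match implementation, pre-compiled from a route list.
function createMapMatcher(paths: string[]): RouteMatcher {
  // Pre-compilation happens once; resolve() is then a single map lookup.
  const table = new Map(paths.map((p) => [p, { path: p, params: {} }]))
  return {
    resolve: (path) => table.get(path) ?? null,
  }
}

const matcher = createMapMatcher(['/', '/about'])
console.log(matcher.resolve('/about')?.path) // → "/about"
console.log(matcher.resolve('/missing')) // → null
```

A real implementation would of course need dynamic segments and ranking, but it shows how the expensive setup can be separated from per-lookup work.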
I was overthinking how to create a perf comparison... I created one here: https://github.com/BobbieGoede/vue-router-perf-bench. It compares both router versions by measuring the time it takes to create the router 100 times with X routes. Maybe it can be changed to make the comparison fairer, but I think it demonstrates the performance difference.
I got these results on my machine (I also checked a flame graph and, for some reason, didn't see `insertMatcher` in it):
| version/routes | 11 | 110 | 550 | 1100 | 2200 |
|---|---|---|---|---|---|
| RouterV3 | 6.23ms | 22.64ms | 100.24ms | 197.53ms | 403.65ms |
| RouterV4 | 9.95ms | 42.49ms | 652.33ms | 2616.43ms | 10295.62ms |
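The core of such a measurement can be sketched like this (a simplified stand-in, not the actual code from the bench repo):

```typescript
// Simplified timing harness: run the same creation work N times and report
// the total wall-clock time. In the real bench the callback would be
// createRouter({ history, routes }) for v3 and v4 respectively.
function benchmark(label: string, createOnce: () => void, iterations = 100): number {
  const start = performance.now()
  for (let i = 0; i < iterations; i++) createOnce()
  const elapsed = performance.now() - start
  console.log(`${label}: ${elapsed.toFixed(2)}ms`)
  return elapsed
}

// Example with a stand-in workload and 1100 generated routes.
const routes = Array.from({ length: 1100 }, (_, i) => ({ path: `/page-${i}` }))
benchmark('create', () => {
  routes.map((r) => ({ ...r }))
})
```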
Custom matchers and/or pre-compilation definitely sound like good features to allow optimizing router/matcher creation. I'm not sure if I have the skills (and time) to add such functionality by myself, but if I can help and contribute, let me know.
I checked the cpuprofile file on https://lahmatiy.github.io/cpupro/ and `insertMatcher` is indeed what takes the most time!
Feel free to give the topic some thought and try to raise a PR. No rush at all
I did some research into what it would take to make a custom matcher for the router, and looked into tries as well. A custom matcher using a prefix (or radix) trie could definitely improve performance; hopefully I'll have time to look into making a matcher that fully matches the functionality/features of the current one. I came across https://github.com/unjs/radix3; if I can't wrap it in a custom matcher, at least I'll be able to use it as a reference.
Since route sorting seems to be the current performance bottleneck (and after seeing unjs/radix3's export method), I tried making a sort cache (#2136). I figured this would be easier to implement to improve performance in the short term.
It definitely works in speeding up router creation, though I'm not entirely sure if this implementation holds up for sorting accuracy.

| version/routes | 11 | 110 | 550 | 1100 | 2200 |
|---|---|---|---|---|---|
| RouterV3 | 5.88ms | 21.90ms | 100.11ms | 200.38ms | 405.75ms |
| RouterV4 | 10.72ms | 40.35ms | 601.14ms | 2386.12ms | 9649.10ms |
| RouterV4Cached | 5.26ms | 32.52ms | 141.46ms | 275.60ms | 553.46ms |
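To illustrate the general idea of a sort cache (my own simplified illustration; `pathScore` is a made-up stand-in, not the actual ranking code from #2136):

```typescript
// Idea: comparisons during route sorting repeatedly rank the same paths, so
// memoize the computed score per path. The scoring below is a fake stand-in;
// the real matcher ranks tokenized path segments with far more nuance.
const scoreCache = new Map<string, number>()

function pathScore(path: string): number {
  let cached = scoreCache.get(path)
  if (cached === undefined) {
    // Pretend static segments rank higher than params, as in the matcher.
    cached = path
      .split('/')
      .reduce((s, seg) => s + (seg.startsWith(':') ? 1 : 10), 0)
    scoreCache.set(path, cached)
  }
  return cached
}

const paths = ['/users/:id', '/users/new', '/:catchAll']
paths.sort((a, b) => pathScore(b) - pathScore(a))
console.log(paths[0]) // → "/users/new" (static-heavy path first)
```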
While it does speed up router creation in general, it looks like matcher creation remains a bottleneck (`tokensToParser` mostly), and after creation it looks like `resolve` takes up some time as well (https://lahmatiy.github.io/cpupro/ is very useful). I will look into expanding my performance comparison bench to also test how `resolve` scales with the amount of routes.
Using radix3 for a custom matcher would be nice. Note that it doesn't cover all the features (it can't, as a trie). It's really great to see that there is no performance issue anymore with the cache. Thanks a lot for taking the time to do it. I haven't been able to check the PR code yet, but hopefully I can give you some feedback soon.
`tokensToParser` should be quite fast already. The difference you see might be because router v3 caches path-to-regexp results while v4 doesn't cache the results of `tokensToParser`. When I tested (a while ago), it was faster than path-to-regexp and caching wasn't necessary, so I wouldn't spend too much time on that part.
> `tokensToParser` should be quite fast already. ... When I tested (a while ago), it was faster than path-to-regexp and caching wasn't necessary, so I wouldn't spend too much time on that part.
You're right, it's fast. Maybe the regex creation could be cached, but it's easy to get carried away when looking for optimizations. As long as we can get the performance similar to that of router v3, I'm happy.
> Using radix3 for a custom matcher would be nice. Note it doesn't cover all features (it can't as a trie).
From the perspective of i18n, a trie sounds appealing since most, if not all, routes are prefixed with a language. I suppose a separate matcher per language could be used in Nuxt I18n to achieve something similar (quick access to prefixed routes) while keeping it functionally the same.
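A per-language matcher lookup could be sketched roughly like this (all names are hypothetical; this is not an actual Nuxt I18n or vue-router API):

```typescript
// Idea: keep one matcher per locale prefix so a lookup only ever scans that
// locale's routes instead of the full multiplied route table.
type Matcher = (path: string) => string | null

function createLocaleMatcher(paths: string[]): Matcher {
  const set = new Set(paths)
  return (path) => (set.has(path) ? path : null)
}

const matchers: Record<string, Matcher> = {
  en: createLocaleMatcher(['/en/about', '/en/contact']),
  nl: createLocaleMatcher(['/nl/about', '/nl/contact']),
}

function resolveLocalized(path: string): string | null {
  // Route the lookup by the locale prefix, e.g. "/nl/about" → "nl".
  const locale = path.split('/')[1]
  return matchers[locale]?.(path) ?? null
}

console.log(resolveLocalized('/nl/about')) // → "/nl/about"
```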
I've attempted a different approach to speeding up `insertMatcher` in #2137.
It uses a binary search to determine the insertion point. It yields a slightly different sort order, but I think it will be functionally equivalent. It's quite possible that I've failed to consider some of the edge cases, but the existing unit tests all pass, so I'm optimistic it can be made to work. If nothing else it may help to identify some missing test cases.
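A minimal sketch of binary-search insertion into a descending-sorted array (an illustration of the technique, not the code from #2137):

```typescript
// Since the matcher array is kept sorted by score (higher scores first),
// the insertion index can be found in O(log n) instead of a linear scan.
function findInsertionIndex(scores: number[], score: number): number {
  let lower = 0
  let upper = scores.length
  while (lower !== upper) {
    const mid = (lower + upper) >> 1
    // Descending order: higher scores come first.
    if (score > scores[mid]) upper = mid
    else lower = mid + 1
  }
  return lower
}

const sorted = [50, 40, 30, 10]
sorted.splice(findInsertionIndex(sorted, 20), 0, 20)
console.log(sorted) // → [50, 40, 30, 20, 10]
```

The real matcher compares multi-part ranking scores rather than single numbers, but the search itself works the same way.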
@skirtles-code Your approach is much better! This essentially makes use of `matchers` already being sorted, right?
In my tests it performs better than my caching approach, I guess this probably means my caching is more expensive than it should/could be. I still think it's interesting to explore caching but it would probably need to cover more of the router matcher creation (not just the sorting) for it to be worth caching after your optimization.
> This essentially makes use of `matchers` already being sorted, right?
Correct.
> In my tests it performs better than my caching approach
Same for me, but my PR isn't currently handling all the edge cases correctly. Fixing that will likely slow it down a bit, which may account for some of the difference we're seeing.
> I still think it's interesting to explore caching
Agreed. Caching should be able to improve various aspects of the performance, not just the insertion.
I've just opened another PR, #2148. This explores some other ideas. I've done a long write up on the PR, but roughly speaking it uses some of the ideas already discussed earlier in this thread (not including caching and binary search) to try to squeeze out extra performance.
In #2166, I've explored the idea of making the matcher pluggable. This allows a custom matcher to be passed when creating the router. More importantly, it allows most of the work of creating a matcher to be done upfront, rather than creating it afresh for each request.
This is essentially a version of the caching/pre-compilation idea that has already been discussed.
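The gist of doing the expensive work upfront can be sketched like this (hypothetical names; not the API proposed in #2166):

```typescript
// Idea: build the expensive matcher once at module scope, then create a
// cheap per-request router that closes over it. This matters for SSR, where
// a fresh router is created on every request.
type Matcher = (path: string) => boolean

function compileMatcher(paths: string[]): Matcher {
  // Expensive work (sorting, regex compilation, ...) happens once here.
  const set = new Set(paths)
  return (path) => set.has(path)
}

// Done once at startup instead of once per request.
const sharedMatcher = compileMatcher(['/', '/about'])

function createRequestRouter() {
  // Per-request state stays cheap; the matcher is shared and immutable.
  return { resolve: (path: string) => sharedMatcher(path) }
}

console.log(createRequestRouter().resolve('/about')) // → true
```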
We also found that vue-router can be a perf bottleneck when the number of routes is large. Will keep an eye on this issue. cc @Mister-Hope
Is this topic still being worked on? I experienced performance issues within a multi-language application with about 1100 routes as well.
FYI I added this to the roadmap. The idea is to go with https://github.com/vuejs/router/pull/2166. That way, the whole creation of the routes could be removed at build time with unplugin-vue-router in the future
@posva I think #2137 (binary search for `insertMatcher`) is also worth considering. I believe it fixes the current performance problems and (in my opinion) it's ready to be merged. It requires no changes to the public API and would also benefit those who can't use the pluggable matcher for whatever reason.
I just checked the impact on the bundle size and it adds about 78 bytes to `vue-router.global.prod.js`, not accounting for compression.
You make a good point. I will try to give it a look next week.
This will be partially fixed by #2137
| (index) | 11 | 110 | 550 | 1100 | 2200 |
|---|---|---|---|---|---|
| RouterV3 | 4.05ms | 17.61ms | 73.11ms | 147.40ms | 317.14ms |
| RouterV4 | 7.51ms | 36.69ms | 524.71ms | 1862.11ms | 7311.56ms |
| RouterVEdge | 5.79ms | 22.57ms | 105.15ms | 218.72ms | 462.07ms |
But the idea is to still allow a pluggable matcher so I will keep this open. Thanks a lot @skirtles-code for the PR and @BobbieGoede for the benchmark!
I published a patch version, so let me know how it goes.
Reproduction
https://stackblitz.com/edit/github-afjnf8?file=src%2Frouter.ts
Steps to reproduce the bug
- Use `ab` or `siege` and compare the results, e.g. `siege -t 10s -c 100 "http://localhost:3000/"` to see the amount of requests within 10s using 100 concurrent users.
- Change the `dirCount` variable to adjust the amount of routes generated/used (default `100`, which generates 1101 routes) and test again.

Expected behavior
Expected the amount of routes to have an impact on performance comparable to router v3. The performance difference is noticeable for SSR as a new router instance is created on each request.
Actual behavior
Router creation doesn't scale as well as it did in v3 (presumably; I have no clean reproduction with the v3 router alone). Please view the result comparison in https://github.com/nuxt/nuxt/issues/25612, which compares Nuxt 2 and Nuxt 3.
If you have a clean/plain SSR project using router v3 I would be happy to include a reproduction with a performance comparison between v3 and v4, I tried making one but had no luck.
Additional information
While debugging performance issues for @nuxtjs/i18n, I narrowed it down to the difference in performance between Nuxt 2 and Nuxt 3, specifically the performance impact of having a high amount of routes. Related issue here: https://github.com/nuxt/nuxt/issues/25612

Due to the way i18n routing works (same as in i18n for Nuxt 2), routes are generated/multiplied by the amount of configured languages. Of course I will look into alternative ways to make routing more efficient/performant when using i18n, but it's worth looking into why the performance of router creation has gotten worse and how we can improve it.
My first guess would be that `insertMatcher` doesn't scale very well; I couldn't find its equivalent in v3.