microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
98.27k stars 12.2k forks source link

RegExp syntax checking performance #58339

Closed rbuckton closed 2 weeks ago

rbuckton commented 2 weeks ago

This makes a few changes to the RegExp syntax checking algorithm for possible performance improvements. This also fixes a small typo in one of the RegExp diagnostic messages.

One of the changes is to ensure we perform proper bounds checks when pos might advance past end. To accomplish this, I added the charCodeChecked and charCodeUnchecked functions. charCodeChecked performs boundary tests which both improves reliability and avoids potential deoptimizations due to reading past the end of a string. charCodeUnchecked is simply a wrapper for text.charCodeAt and exists primarily to make it easier to readily identify unchecked vs checked reads from text. charCodeUnchecked should only be used when we've already performed a bounds check before reading a character, such as in a while (pos < end) loop. I've also done the same for code points with codePointChecked and codePointUnchecked. Each of these functions is small, so ideally will be inlined by an optimizing compiler like the one used by V8.

I've primarily focused on optimizing scanRegularExpressionWorker for now. If this seems effective, I will investigate adopting charCodeChecked in more places in the scanner in a later PR.

I've also made a few other small perf improvements:

rbuckton commented 2 weeks ago

@typescript-bot perf test

typescript-bot commented 2 weeks ago

Starting jobs; this comment will be updated as builds start and complete.

Command Status Results
perf test ✅ Started ❌ Results
typescript-bot commented 2 weeks ago

@rbuckton, the perf run you requested failed. You can check the log here.

typescript-bot commented 2 weeks ago

@rbuckton The results of the perf run you requested are in!

Here they are:

tsc

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
Compiler-Unions - node (v18.15.0, x64)
Errors 30 30 ~ ~ ~ p=1.000 n=6
Symbols 62,154 62,154 ~ ~ ~ p=1.000 n=6
Types 50,273 50,273 ~ ~ ~ p=1.000 n=6
Memory used 192,825k (± 0.74%) 192,796k (± 0.75%) ~ 192,135k 195,762k p=0.521 n=6
Parse Time 2.02s (± 1.38%) 1.95s (± 0.46%) 🟩-0.07s (- 3.62%) 1.94s 1.96s p=0.005 n=6
Bind Time 1.07s (± 1.09%) 1.07s (± 0.96%) ~ 1.06s 1.08s p=0.437 n=6
Check Time 14.04s (± 0.08%) 14.08s (± 0.29%) +0.04s (+ 0.30%) 14.04s 14.13s p=0.040 n=6
Emit Time 3.87s (± 0.90%) 3.88s (± 0.44%) ~ 3.86s 3.90s p=0.627 n=6
Total Time 21.00s (± 0.09%) 20.98s (± 0.23%) ~ 20.91s 21.04s p=0.686 n=6
angular-1 - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 945,172 945,172 ~ ~ ~ p=1.000 n=6
Types 408,068 408,068 ~ ~ ~ p=1.000 n=6
Memory used 1,222,036k (± 0.00%) 1,222,022k (± 0.00%) ~ 1,221,973k 1,222,117k p=0.378 n=6
Parse Time 6.92s (± 0.50%) 6.78s (± 0.50%) -0.15s (- 2.14%) 6.75s 6.84s p=0.005 n=6
Bind Time 1.87s (± 0.71%) 1.87s (± 0.28%) ~ 1.87s 1.88s p=0.672 n=6
Check Time 31.41s (± 0.32%) 31.36s (± 0.17%) ~ 31.27s 31.42s p=0.295 n=6
Emit Time 14.57s (± 0.82%) 14.65s (± 0.75%) ~ 14.56s 14.86s p=0.809 n=6
Total Time 54.77s (± 0.42%) 54.66s (± 0.33%) ~ 54.52s 55.00s p=0.297 n=6
mui-docs - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 1,954,635 1,954,635 ~ ~ ~ p=1.000 n=6
Types 676,418 676,418 ~ ~ ~ p=1.000 n=6
Memory used 1,753,442k (± 0.00%) 1,753,451k (± 0.00%) ~ 1,753,422k 1,753,484k p=0.630 n=6
Parse Time 6.89s (± 0.31%) 6.73s (± 0.35%) -0.17s (- 2.42%) 6.70s 6.76s p=0.005 n=6
Bind Time 2.31s (± 0.18%) 2.31s (± 0.33%) ~ 2.30s 2.32s p=1.000 n=6
Check Time 56.81s (± 0.11%) 56.85s (± 0.45%) ~ 56.63s 57.29s p=0.470 n=6
Emit Time 0.14s (± 2.88%) 0.14s ~ ~ ~ p=0.405 n=6
Total Time 66.16s (± 0.13%) 66.03s (± 0.38%) ~ 65.82s 66.43s p=0.297 n=6
self-build-src - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,606 +39 (+ 0.00%) ~ ~ p=0.001 n=6
Types 257,612 257,638 +26 (+ 0.01%) ~ ~ p=0.001 n=6
Memory used 2,323,147k (± 0.03%) 2,323,519k (± 0.02%) ~ 2,323,071k 2,324,130k p=0.471 n=6
Parse Time 7.56s (± 0.61%) 7.44s (± 1.06%) -0.12s (- 1.59%) 7.34s 7.57s p=0.016 n=6
Bind Time 2.75s (± 0.48%) 2.75s (± 0.85%) ~ 2.73s 2.78s p=0.936 n=6
Check Time 49.78s (± 0.59%) 49.66s (± 0.78%) ~ 49.29s 50.38s p=0.471 n=6
Emit Time 3.91s (± 1.63%) 3.95s (± 3.43%) ~ 3.85s 4.22s p=0.810 n=6
Total Time 64.01s (± 0.48%) 63.82s (± 0.61%) ~ 63.55s 64.57s p=0.173 n=6
self-build-src-public-api - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,606 +39 (+ 0.00%) ~ ~ p=0.001 n=6
Types 257,612 257,638 +26 (+ 0.01%) ~ ~ p=0.001 n=6
Memory used 2,396,896k (± 0.01%) 2,397,995k (± 0.02%) +1,099k (+ 0.05%) 2,397,126k 2,398,511k p=0.008 n=6
Parse Time 5.25s (± 1.17%) 5.15s (± 0.75%) -0.09s (- 1.78%) 5.10s 5.20s p=0.016 n=6
Bind Time 1.70s (± 1.33%) 1.69s (± 1.71%) ~ 1.64s 1.73s p=0.870 n=6
Check Time 34.43s (± 0.21%) 34.41s (± 0.17%) ~ 34.34s 34.47s p=0.471 n=6
Emit Time 2.63s (± 0.83%) 2.65s (± 1.54%) ~ 2.60s 2.70s p=0.575 n=6
Total Time 44.02s (± 0.23%) 43.91s (± 0.22%) ~ 43.81s 44.04s p=0.199 n=6
self-compiler - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 256,196 256,224 +28 (+ 0.01%) ~ ~ p=0.001 n=6
Types 103,640 103,656 +16 (+ 0.02%) ~ ~ p=0.001 n=6
Memory used 424,190k (± 0.01%) 424,200k (± 0.01%) ~ 424,167k 424,242k p=0.471 n=6
Parse Time 3.49s (± 1.55%) 3.33s (± 0.35%) 🟩-0.16s (- 4.72%) 3.31s 3.34s p=0.005 n=6
Bind Time 1.30s (± 0.90%) 1.30s (± 0.84%) ~ 1.29s 1.32s p=0.865 n=6
Check Time 18.21s (± 0.37%) 18.18s (± 0.39%) ~ 18.09s 18.27s p=0.687 n=6
Emit Time 1.36s (± 1.28%) 1.38s (± 0.89%) ~ 1.36s 1.39s p=0.241 n=6
Total Time 24.37s (± 0.31%) 24.18s (± 0.32%) -0.18s (- 0.76%) 24.06s 24.25s p=0.005 n=6
ts-pre-modules - node (v18.15.0, x64)
Errors 35 35 ~ ~ ~ p=1.000 n=6
Symbols 224,824 224,824 ~ ~ ~ p=1.000 n=6
Types 93,390 93,390 ~ ~ ~ p=1.000 n=6
Memory used 369,410k (± 0.05%) 369,296k (± 0.00%) -114k (- 0.03%) 369,279k 369,315k p=0.037 n=6
Parse Time 3.67s (± 0.38%) 3.53s (± 1.05%) 🟩-0.14s (- 3.91%) 3.49s 3.59s p=0.005 n=6
Bind Time 1.95s (± 1.41%) 1.94s (± 0.78%) ~ 1.92s 1.96s p=0.413 n=6
Check Time 19.48s (± 0.41%) 19.44s (± 0.24%) ~ 19.36s 19.49s p=0.195 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 25.10s (± 0.30%) 24.90s (± 0.21%) -0.20s (- 0.80%) 24.83s 24.97s p=0.005 n=6
vscode - node (v18.15.0, x64)
Errors 4 4 ~ ~ ~ p=1.000 n=6
Symbols 2,797,149 2,797,149 ~ ~ ~ p=1.000 n=6
Types 950,053 950,053 ~ ~ ~ p=1.000 n=6
Memory used 2,925,215k (± 0.00%) 2,925,305k (± 0.00%) ~ 2,925,155k 2,925,515k p=0.575 n=6
Parse Time 13.56s (± 0.28%) 13.32s (± 0.30%) -0.24s (- 1.79%) 13.26s 13.38s p=0.005 n=6
Bind Time 4.12s (± 2.10%) 4.09s (± 0.25%) ~ 4.07s 4.10s p=0.615 n=6
Check Time 73.08s (± 0.39%) 73.11s (± 0.48%) ~ 72.52s 73.49s p=0.689 n=6
Emit Time 21.41s (± 9.45%) 21.03s (± 8.39%) ~ 19.61s 23.52s p=0.810 n=6
Total Time 112.17s (± 1.88%) 111.54s (± 1.59%) ~ 110.03s 114.21s p=0.575 n=6
webpack - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 265,853 265,853 ~ ~ ~ p=1.000 n=6
Types 108,438 108,438 ~ ~ ~ p=1.000 n=6
Memory used 410,385k (± 0.02%) 410,331k (± 0.01%) ~ 410,276k 410,388k p=0.109 n=6
Parse Time 3.95s (± 0.56%) 3.84s (± 0.61%) -0.11s (- 2.74%) 3.80s 3.87s p=0.005 n=6
Bind Time 1.69s (± 0.79%) 1.67s (± 0.63%) -0.02s (- 1.38%) 1.65s 1.68s p=0.017 n=6
Check Time 17.05s (± 0.41%) 17.06s (± 0.30%) ~ 16.97s 17.10s p=0.936 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 22.69s (± 0.36%) 22.57s (± 0.33%) -0.12s (- 0.54%) 22.43s 22.64s p=0.045 n=6
xstate-main - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 523,981 523,981 ~ ~ ~ p=1.000 n=6
Types 178,708 178,708 ~ ~ ~ p=1.000 n=6
Memory used 461,203k (± 0.01%) 461,166k (± 0.01%) ~ 461,120k 461,206k p=0.298 n=6
Parse Time 2.69s (± 0.61%) 2.61s (± 0.56%) -0.08s (- 2.92%) 2.59s 2.63s p=0.005 n=6
Bind Time 0.98s (± 0.64%) 0.99s (± 0.82%) +0.01s (+ 1.36%) 0.98s 1.00s p=0.023 n=6
Check Time 15.47s (± 0.36%) 15.31s (± 0.26%) -0.16s (- 1.02%) 15.26s 15.36s p=0.005 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 19.14s (± 0.24%) 18.91s (± 0.19%) -0.23s (- 1.18%) 18.87s 18.97s p=0.005 n=6
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • Compiler-Unions - node (v18.15.0, x64)
  • angular-1 - node (v18.15.0, x64)
  • mui-docs - node (v18.15.0, x64)
  • self-build-src - node (v18.15.0, x64)
  • self-build-src-public-api - node (v18.15.0, x64)
  • self-compiler - node (v18.15.0, x64)
  • ts-pre-modules - node (v18.15.0, x64)
  • vscode - node (v18.15.0, x64)
  • webpack - node (v18.15.0, x64)
  • xstate-main - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

tsserver

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
Compiler-UnionsTSServer - node (v18.15.0, x64)
Req 1 - updateOpen 3,504ms (± 0.33%) 3,421ms (± 0.43%) -83ms (- 2.36%) 3,399ms 3,436ms p=0.005 n=6
Req 2 - geterr 7,555ms (± 0.59%) 7,546ms (± 0.55%) ~ 7,478ms 7,603ms p=0.575 n=6
Req 3 - references 439ms (± 1.96%) 428ms (± 1.03%) -12ms (- 2.62%) 423ms 436ms p=0.016 n=6
Req 4 - navto 336ms (± 0.36%) 337ms (± 0.75%) ~ 334ms 341ms p=0.803 n=6
Req 5 - completionInfo count 1,357 1,357 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 112ms (± 1.04%) 116ms (± 8.29%) ~ 111ms 135ms p=0.867 n=6
CompilerTSServer - node (v18.15.0, x64)
Req 1 - updateOpen 3,701ms (± 1.21%) 3,581ms (± 0.38%) 🟩-120ms (- 3.25%) 3,556ms 3,593ms p=0.005 n=6
Req 2 - geterr 5,702ms (± 0.16%) 5,685ms (± 0.40%) ~ 5,652ms 5,715ms p=0.128 n=6
Req 3 - references 445ms (± 0.46%) 443ms (± 0.46%) ~ 441ms 447ms p=0.145 n=6
Req 4 - navto 341ms (± 1.35%) 340ms (± 1.31%) ~ 333ms 346ms p=1.000 n=6
Req 5 - completionInfo count 1,519 1,519 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 111ms (± 6.02%) 109ms (± 0.90%) ~ 107ms 110ms p=0.929 n=6
xstate-main-1-tsserver - node (v18.15.0, x64)
Req 1 - updateOpen 6,228ms (± 0.67%) 7,346ms (± 8.35%) ~ 6,095ms 7,632ms p=0.066 n=6
Req 2 - geterr 1,633ms (± 8.49%) 1,627ms (± 8.59%) ~ 1,342ms 1,695ms p=0.810 n=6
Req 3 - references 127ms (± 3.66%) 123ms (± 9.38%) ~ 100ms 133ms p=0.934 n=6
Req 4 - navto 588ms (± 2.73%) 584ms (± 2.54%) ~ 573ms 609ms p=0.871 n=6
Req 5 - completionInfo count 3,413 3,413 ~ ~ ~ p=1.000 n=6
Req 5 - completionInfo 1,273ms (± 1.81%) 1,261ms (± 1.97%) ~ 1,228ms 1,304ms p=0.423 n=6
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • CompilerTSServer - node (v18.15.0, x64)
  • Compiler-UnionsTSServer - node (v18.15.0, x64)
  • xstate-main-1-tsserver - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

startup

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
tsc-startup - node (v18.15.0, x64)
Execution time 227.45ms (± 0.18%) 227.31ms (± 0.17%) -0.14ms (- 0.06%) 225.53ms 231.35ms p=0.001 n=600
tsserver-startup - node (v18.15.0, x64)
Execution time 359.11ms (± 0.27%) 359.09ms (± 0.28%) ~ 350.85ms 367.65ms p=0.655 n=600
tsserverlibrary-startup - node (v18.15.0, x64)
Execution time 287.26ms (± 0.28%) 287.27ms (± 0.32%) ~ 280.38ms 306.23ms p=0.291 n=600
typescript-startup - node (v18.15.0, x64)
Execution time 286.91ms (± 0.30%) 286.91ms (± 0.28%) ~ 280.02ms 293.59ms p=0.230 n=600
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • tsc-startup - node (v18.15.0, x64)
  • tsserver-startup - node (v18.15.0, x64)
  • tsserverlibrary-startup - node (v18.15.0, x64)
  • typescript-startup - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

Developer Information:

Download Benchmarks

rbuckton commented 2 weeks ago

@typescript-bot run dt @typescript-bot test top400 @typescript-bot test tsserver top100 @typescript-bot user test this @typescript-bot user test tsserver

typescript-bot commented 2 weeks ago

Starting jobs; this comment will be updated as builds start and complete.

Command Status Results
run dt ✅ Started ✅ Results
test top400 ✅ Started ✅ Results
test tsserver top100 ✅ Started
user test this ✅ Started ✅ Results
user test tsserver ✅ Started ✅ Results
typescript-bot commented 2 weeks ago

@rbuckton Here are the results of running the user tests comparing main and refs/pull/58339/merge:

Everything looks good!

typescript-bot commented 2 weeks ago

Hey @rbuckton, the results of running the DT tests are ready.

Everything looks the same!

You can check the log here.

typescript-bot commented 2 weeks ago

@rbuckton Here are the results of running the user tests comparing main and refs/pull/58339/merge:

Everything looks good!

typescript-bot commented 2 weeks ago

@rbuckton Here are the results of running the top 400 repos comparing main and refs/pull/58339/merge:

Everything looks good!

jakebailey commented 2 weeks ago

The perf seels really good. Is that all from avoiding out of bounds?

rbuckton commented 2 weeks ago

The perf seels really good. Is that all from avoiding out of bounds?

Yeah, if charCodeAt is called for an out of bounds index it returns NaN, which isn't an SMI. IIRC, accessing a string out of bounds also results in sub optional compilation as it requires a recompile to introduce additional checks in the compiled code.

DanielRosenwasser commented 2 weeks ago

Might be worth mentioning in the PR description that you also var-ified the regular expression parser (https://github.com/microsoft/TypeScript/issues/52924).

DanielRosenwasser commented 2 weeks ago

Also other specific optimizations - like hoisting out helper functions and lazily initializing state for captures.

rbuckton commented 2 weeks ago

I've been working on expanding this throughout scanner.ts and writing up a lint rule to ensure enforcement, so I'll probably need another review shortly.

rbuckton commented 2 weeks ago

I've been working on expanding this throughout scanner.ts and writing up a lint rule to ensure enforcement, so I'll probably need another review shortly.

Since I have some additional testing to do, I think I'll put those changes up as a separate PR following this one.