microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
98.27k stars 12.2k forks source link

Ensure equal FileIncludeReasons are not duplicitively added #58352

Closed jakebailey closed 1 week ago

jakebailey commented 2 weeks ago

When we add file include reasons, they're effectively just stored per file path in a big list (a MultiMap). When we reuse a program, we keep using that same list, but may rediscover the same files over and over again, pushing more and more into that mapping.

Whenever we call createDiagnosticExplainingFile, we search through the file include reasons and include them in the related info. In the case of forceConsistentCasingInFileNames, we'll build a potentially massive diagnostic that references these other files.

Now, imagine the intersection of the two; I'm in eslint with a watch program, forceConsistentCasingInFileNames is the new default, but its diagnostics are never given to anyone. Maybe a file with the wrong case is queried over and over as the user types, it keeps getting added to a reused fileReasons map, then the forceConsistentCasingInFileNames just keeps building bigger and bigger diagnostics.

If we ensure that fileReason doesn't duplicitively include reasons, we avoid the blowup. It's still not great that there are bigish diagnostics, or that clients may query using a different path, etc, but I think this may handle the OOM. And the baseline change shows that duplication removes some errors.

It's also worth noting that DiagnosticCollection doesn't seem to handle deduplicating these diagnostics at all, probably due to the duplication or differing orders. But, it may after this PR? (Hard to observe.)

Relatd:

jakebailey commented 2 weeks ago

@typescript-bot test it @typescript-bot pack this

jakebailey commented 2 weeks ago

@typescript-bot test it @typescript-bot pack this

typescript-bot commented 2 weeks ago

Starting jobs; this comment will be updated as builds start and complete.

Command Status Results
test top400 ✅ Started ✅ Results
user test this ✅ Started ✅ Results
run dt ✅ Started ✅ Results
perf test this faster ✅ Started 👀 Results
pack this ✅ Started ✅ Results
typescript-bot commented 2 weeks ago

Hey @jakebailey, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/161528/artifacts?artifactName=tgz&fileId=67D5A9311B7F95D1DC73F811183ED77CB70B1F5160B49B8D0BB164C993C5C00802&fileName=/typescript-5.5.0-insiders.20240429.tgz"
    }
}

and then running npm install.


There is also a playground for this build and an npm module you can use via "typescript": "npm:@typescript-deploys/pr-build@5.5.0-pr-58352-3".;

typescript-bot commented 2 weeks ago

Hey @jakebailey, the results of running the DT tests are ready.

Everything looks the same!

You can check the log here.

typescript-bot commented 2 weeks ago

@jakebailey Here are the results of running the user tests comparing main and refs/pull/58352/merge:

Everything looks good!

typescript-bot commented 2 weeks ago

@jakebailey The results of the perf run you requested are in!

Here they are:

tsc

Comparison Report - baseline..pr
Metric baseline pr Delta Best Worst p-value
Compiler-Unions - node (v18.15.0, x64)
Errors 30 30 ~ ~ ~ p=1.000 n=6
Symbols 62,154 62,154 ~ ~ ~ p=1.000 n=6
Types 50,273 50,273 ~ ~ ~ p=1.000 n=6
Memory used 192,812k (± 0.80%) 192,879k (± 0.73%) ~ 192,158k 195,729k p=0.521 n=6
Parse Time 1.36s (± 0.38%) 1.35s (± 1.21%) ~ 1.32s 1.36s p=0.324 n=6
Bind Time 0.72s 0.72s ~ ~ ~ p=1.000 n=6
Check Time 9.57s (± 0.30%) 9.58s (± 0.48%) ~ 9.54s 9.67s p=0.807 n=6
Emit Time 2.63s (± 0.59%) 2.62s (± 0.39%) ~ 2.60s 2.63s p=0.182 n=6
Total Time 14.27s (± 0.20%) 14.27s (± 0.26%) ~ 14.23s 14.33s p=0.747 n=6
angular-1 - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 945,172 945,172 ~ ~ ~ p=1.000 n=6
Types 408,068 408,068 ~ ~ ~ p=1.000 n=6
Memory used 1,222,070k (± 0.00%) 1,222,036k (± 0.00%) -34k (- 0.00%) 1,222,006k 1,222,060k p=0.045 n=6
Parse Time 6.92s (± 0.46%) 6.96s (± 0.16%) +0.04s (+ 0.58%) 6.94s 6.97s p=0.008 n=6
Bind Time 1.87s (± 0.98%) 1.85s (± 0.22%) -0.02s (- 1.07%) 1.85s 1.86s p=0.025 n=6
Check Time 31.43s (± 0.38%) 31.28s (± 0.40%) ~ 31.09s 31.41s p=0.128 n=6
Emit Time 14.65s (± 0.65%) 14.68s (± 0.53%) ~ 14.56s 14.75s p=0.748 n=6
Total Time 54.87s (± 0.20%) 54.77s (± 0.25%) ~ 54.62s 54.96s p=0.199 n=6
mui-docs - node (v18.15.0, x64)
Errors 5 5 ~ ~ ~ p=1.000 n=6
Symbols 1,954,660 1,954,660 ~ ~ ~ p=1.000 n=6
Types 676,415 676,415 ~ ~ ~ p=1.000 n=6
Memory used 1,753,518k (± 0.00%) 1,753,526k (± 0.00%) ~ 1,753,516k 1,753,549k p=0.810 n=6
Parse Time 6.88s (± 0.31%) 7.13s (± 0.30%) +0.24s (+ 3.54%) 7.09s 7.15s p=0.005 n=6
Bind Time 2.31s (± 0.45%) 2.33s (± 0.36%) ~ 2.31s 2.33s p=0.109 n=6
Check Time 56.91s (± 0.30%) 56.63s (± 0.52%) ~ 56.22s 57.04s p=0.093 n=6
Emit Time 0.14s (± 5.31%) 0.14s (± 2.95%) ~ 0.13s 0.14s p=0.389 n=6
Total Time 66.25s (± 0.24%) 66.21s (± 0.45%) ~ 65.80s 66.65s p=0.689 n=6
self-build-src - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,638 +71 (+ 0.01%) ~ ~ p=0.001 n=6
Types 257,612 257,615 +3 (+ 0.00%) ~ ~ p=0.001 n=6
Memory used 2,323,296k (± 0.02%) 2,323,248k (± 0.02%) ~ 2,322,732k 2,323,921k p=0.810 n=6
Parse Time 5.06s (± 0.78%) 5.10s (± 0.79%) ~ 5.04s 5.15s p=0.199 n=6
Bind Time 1.90s (± 1.28%) 1.88s (± 1.23%) ~ 1.86s 1.92s p=0.257 n=6
Check Time 33.89s (± 0.40%) 34.08s (± 0.17%) +0.19s (+ 0.55%) 34.02s 34.14s p=0.020 n=6
Emit Time 2.64s (± 0.59%) 2.62s (± 0.98%) ~ 2.58s 2.65s p=0.106 n=6
Total Time 43.48s (± 0.31%) 43.68s (± 0.15%) +0.19s (+ 0.44%) 43.58s 43.76s p=0.031 n=6
self-build-src-public-api - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 1,215,567 1,215,638 +71 (+ 0.01%) ~ ~ p=0.001 n=6
Types 257,612 257,615 +3 (+ 0.00%) ~ ~ p=0.001 n=6
Memory used 2,397,259k (± 0.00%) 2,397,500k (± 0.02%) ~ 2,396,852k 2,398,164k p=0.230 n=6
Parse Time 6.30s (± 0.63%) 6.36s (± 0.84%) ~ 6.30s 6.45s p=0.078 n=6
Bind Time 2.01s (± 0.61%) 2.00s (± 0.84%) ~ 1.98s 2.03s p=0.131 n=6
Check Time 40.65s (± 0.19%) 40.70s (± 0.34%) ~ 40.54s 40.89s p=0.378 n=6
Emit Time 3.13s (± 1.41%) 3.15s (± 1.09%) ~ 3.10s 3.19s p=0.228 n=6
Total Time 52.12s (± 0.26%) 52.24s (± 0.30%) ~ 52.01s 52.45s p=0.128 n=6
self-compiler - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 256,196 256,212 +16 (+ 0.01%) ~ ~ p=0.001 n=6
Types 103,640 103,643 +3 (+ 0.00%) ~ ~ p=0.001 n=6
Memory used 424,218k (± 0.01%) 424,265k (± 0.00%) +47k (+ 0.01%) 424,251k 424,275k p=0.013 n=6
Parse Time 3.48s (± 0.66%) 3.50s (± 0.52%) ~ 3.48s 3.53s p=0.124 n=6
Bind Time 1.31s (± 1.74%) 1.30s (± 0.49%) ~ 1.29s 1.31s p=0.615 n=6
Check Time 18.19s (± 0.41%) 18.20s (± 0.35%) ~ 18.10s 18.30s p=0.936 n=6
Emit Time 1.38s (± 1.28%) 1.38s (± 1.64%) ~ 1.34s 1.41s p=0.934 n=6
Total Time 24.36s (± 0.33%) 24.38s (± 0.31%) ~ 24.25s 24.47s p=0.630 n=6
ts-pre-modules - node (v18.15.0, x64)
Errors 35 35 ~ ~ ~ p=1.000 n=6
Symbols 224,824 224,824 ~ ~ ~ p=1.000 n=6
Types 93,390 93,390 ~ ~ ~ p=1.000 n=6
Memory used 369,280k (± 0.01%) 369,339k (± 0.01%) +59k (+ 0.02%) 369,278k 369,410k p=0.037 n=6
Parse Time 3.68s (± 0.53%) 3.67s (± 0.89%) ~ 3.63s 3.72s p=0.468 n=6
Bind Time 1.92s (± 1.21%) 1.94s (± 1.01%) ~ 1.92s 1.96s p=0.411 n=6
Check Time 19.39s (± 0.29%) 19.45s (± 0.51%) ~ 19.31s 19.57s p=0.172 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 24.98s (± 0.28%) 25.06s (± 0.31%) ~ 24.95s 25.16s p=0.149 n=6
vscode - node (v18.15.0, x64)
Errors 4 4 ~ ~ ~ p=1.000 n=6
Symbols 2,797,340 2,797,340 ~ ~ ~ p=1.000 n=6
Types 950,105 950,105 ~ ~ ~ p=1.000 n=6
Memory used 2,925,439k (± 0.00%) 2,925,543k (± 0.00%) ~ 2,925,445k 2,925,654k p=0.128 n=6
Parse Time 16.70s (± 0.48%) 16.91s (± 0.27%) +0.21s (+ 1.26%) 16.86s 16.98s p=0.005 n=6
Bind Time 5.01s (± 2.07%) 5.11s (± 2.48%) ~ 4.94s 5.21s p=0.574 n=6
Check Time 88.76s (± 0.44%) 88.91s (± 0.43%) ~ 88.18s 89.16s p=0.378 n=6
Emit Time 24.58s (± 7.30%) 23.92s (± 0.57%) ~ 23.75s 24.06s p=1.000 n=6
Total Time 135.06s (± 1.57%) 134.85s (± 0.22%) ~ 134.28s 135.08s p=0.109 n=6
webpack - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 265,853 265,853 ~ ~ ~ p=1.000 n=6
Types 108,438 108,438 ~ ~ ~ p=1.000 n=6
Memory used 410,420k (± 0.02%) 410,488k (± 0.02%) ~ 410,372k 410,577k p=0.199 n=6
Parse Time 4.88s (± 0.70%) 4.88s (± 1.16%) ~ 4.83s 4.99s p=0.687 n=6
Bind Time 2.06s (± 0.85%) 2.07s (± 0.68%) ~ 2.05s 2.09s p=0.743 n=6
Check Time 21.16s (± 0.32%) 21.15s (± 0.34%) ~ 21.01s 21.21s p=0.810 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 28.10s (± 0.33%) 28.10s (± 0.38%) ~ 27.93s 28.26s p=1.000 n=6
xstate-main - node (v18.15.0, x64)
Errors 0 0 ~ ~ ~ p=1.000 n=6
Symbols 523,981 523,981 ~ ~ ~ p=1.000 n=6
Types 178,708 178,708 ~ ~ ~ p=1.000 n=6
Memory used 461,265k (± 0.03%) 461,254k (± 0.02%) ~ 461,164k 461,351k p=1.000 n=6
Parse Time 3.24s (± 0.58%) 3.24s (± 0.42%) ~ 3.22s 3.26s p=0.622 n=6
Bind Time 1.17s (± 0.44%) 1.17s (± 0.64%) ~ 1.16s 1.18s p=0.784 n=6
Check Time 18.15s (± 0.37%) 18.23s (± 0.24%) ~ 18.16s 18.27s p=0.052 n=6
Emit Time 0.00s 0.00s ~ ~ ~ p=1.000 n=6
Total Time 22.57s (± 0.25%) 22.64s (± 0.16%) +0.07s (+ 0.32%) 22.59s 22.69s p=0.037 n=6
System info unknown
Hosts
  • node (v18.15.0, x64)
Scenarios
  • Compiler-Unions - node (v18.15.0, x64)
  • angular-1 - node (v18.15.0, x64)
  • mui-docs - node (v18.15.0, x64)
  • self-build-src - node (v18.15.0, x64)
  • self-build-src-public-api - node (v18.15.0, x64)
  • self-compiler - node (v18.15.0, x64)
  • ts-pre-modules - node (v18.15.0, x64)
  • vscode - node (v18.15.0, x64)
  • webpack - node (v18.15.0, x64)
  • xstate-main - node (v18.15.0, x64)
Benchmark Name Iterations
Current pr 6
Baseline baseline 6

Developer Information:

Download Benchmarks

typescript-bot commented 2 weeks ago

@jakebailey Here are the results of running the top 400 repos comparing main and refs/pull/58352/merge:

Everything looks good!

sheetalkamat commented 2 weeks ago

When we reuse a program, we keep using that same list, but may rediscover the same files over and over again, pushing more and more into that mapping.

Thats not true.. Unless program is completely reused the fileReasons from old program are not reused so they should not change between programs.
What you have fixed here seems like a issue where file(a) may be included by say import from one of the file(b) that was imported though different root file(c). Then if that b is also root file, we will process its imports again so file c is added with same reason and that is what this deduplication is handling..

If you look for fileReasons it does not change between programs. Its only changed through findSourceFileWorker and that is not called when we are trying to determine if program can be reused. (tryReuseProgramStructure)

jakebailey commented 2 weeks ago

Gotcha; the deduping is somewhat nice though, but if it's not actually the problem then the deduping is probably just plain worse for perf and I should just ditch this PR.

jakebailey commented 1 week ago

Closing in favor of #58398, which includes some aspects of this PR.