Ensure equal FileIncludeReasons are not duplicitively added

jakebailey commented 2 weeks ago

When we add file include reasons, they're effectively just stored per file path in a big list (a MultiMap). When we reuse a program, we keep using that same list, but may rediscover the same files over and over again, pushing more and more into that mapping.
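
To make the growth concrete, here is a minimal sketch of a multimap keyed by file path. `Reason`, `MultiMap`, and the paths are hypothetical, simplified stand-ins, not the compiler's actual types; the point is only that an unconditional `add` keeps appending equal entries.

```typescript
// Hypothetical, simplified stand-ins; not the compiler's real data structures.
type Reason = { kind: "Import" | "RootFile"; from?: string };

class MultiMap<K, V> {
    private readonly map = new Map<K, V[]>();

    // Appends unconditionally: equal values are stored again.
    add(key: K, value: V): void {
        const list = this.map.get(key);
        if (list) list.push(value);
        else this.map.set(key, [value]);
    }

    get(key: K): readonly V[] {
        return this.map.get(key) ?? [];
    }
}

const fileReasons = new MultiMap<string, Reason>();

// The same reason recorded three times, as if the file were rediscovered.
for (let i = 0; i < 3; i++) {
    fileReasons.add("/src/a.ts", { kind: "Import", from: "/src/b.ts" });
}

const reasonCount = fileReasons.get("/src/a.ts").length;
```

In this sketch `reasonCount` grows by one for every rediscovery, even though all three entries describe the same inclusion.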

Whenever we call createDiagnosticExplainingFile, we search through the file include reasons and include them in the related info. In the case of forceConsistentCasingInFileNames, we'll build a potentially massive diagnostic that references these other files.
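
As a rough illustration of why the diagnostic size tracks the reason list, the sketch below builds one related-info entry per stored reason. `explainFile`, `ExplainingDiagnostic`, and the message text are hypothetical and only loosely modeled on the description above; this is not the body of createDiagnosticExplainingFile.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface RelatedInfo {
    messageText: string;
}

interface ExplainingDiagnostic {
    messageText: string;
    relatedInformation: RelatedInfo[];
}

// One related-info entry is produced per stored reason, so a reason list
// full of duplicates yields a proportionally larger diagnostic.
function explainFile(path: string, reasons: readonly string[]): ExplainingDiagnostic {
    return {
        messageText: `Explaining why '${path}' is part of the program`,
        relatedInformation: reasons.map(reason => ({ messageText: reason })),
    };
}

const duplicated = ["imported from b.ts", "imported from b.ts", "imported from b.ts"];
const diagnostic = explainFile("a.ts", duplicated);
const relatedCount = diagnostic.relatedInformation.length;
```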

Now, imagine the intersection of the two; I'm in eslint with a watch program, forceConsistentCasingInFileNames is the new default, but its diagnostics are never given to anyone. Maybe a file with the wrong case is queried over and over as the user types, it keeps getting added to a reused fileReasons map, then the forceConsistentCasingInFileNames just keeps building bigger and bigger diagnostics.

If we ensure that fileReasons doesn't include duplicate reasons, we avoid the blowup. It's still not great that there are biggish diagnostics, or that clients may query using a different path, etc., but I think this may handle the OOM. And the baseline change shows that deduplication removes some errors.
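
A minimal sketch of the deduplicating add, assuming reasons can be compared field by field. `Reason`, `reasonsEqual`, and `addReasonOnce` are hypothetical names and shapes chosen for illustration; the real FileIncludeReason type has more variants than this.

```typescript
// Hypothetical, flattened reason shape; an assumption for this sketch.
interface Reason {
    kind: string;
    file?: string;
    index?: number;
}

// Structural equality over the fields this sketch knows about.
function reasonsEqual(a: Reason, b: Reason): boolean {
    return a.kind === b.kind && a.file === b.file && a.index === b.index;
}

// Adds the reason only if no equal reason is already stored for the path.
// Returns true when the reason was actually added.
function addReasonOnce(map: Map<string, Reason[]>, path: string, reason: Reason): boolean {
    const existing = map.get(path);
    if (!existing) {
        map.set(path, [reason]);
        return true;
    }
    if (existing.some(r => reasonsEqual(r, reason))) {
        return false;
    }
    existing.push(reason);
    return true;
}

const dedupedReasons = new Map<string, Reason[]>();
const first = addReasonOnce(dedupedReasons, "/src/a.ts", { kind: "Import", file: "/src/b.ts", index: 0 });
const repeat = addReasonOnce(dedupedReasons, "/src/a.ts", { kind: "Import", file: "/src/b.ts", index: 0 });
const other = addReasonOnce(dedupedReasons, "/src/a.ts", { kind: "RootFile", index: 1 });
```

Note that the check is a linear scan over the reasons already stored for that path, so every add pays a cost proportional to the list length.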

It's also worth noting that DiagnosticCollection doesn't seem to handle deduplicating these diagnostics at all, probably due to the duplication or differing orders. But, it may after this PR? (Hard to observe.)

Related:

jakebailey commented 2 weeks ago

@typescript-bot test it @typescript-bot pack this

typescript-bot commented 2 weeks ago

Starting jobs; this comment will be updated as builds start and complete.

Command	Status	Results
`test top400`	✅ Started	✅ Results
`user test this`	✅ Started	✅ Results
`run dt`	✅ Started	✅ Results
`perf test this faster`	✅ Started	👀 Results
`pack this`	✅ Started	✅ Results

typescript-bot commented 2 weeks ago

Hey @jakebailey, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/161528/artifacts?artifactName=tgz&fileId=67D5A9311B7F95D1DC73F811183ED77CB70B1F5160B49B8D0BB164C993C5C00802&fileName=/typescript-5.5.0-insiders.20240429.tgz"
    }
}

and then running npm install.

There is also a playground for this build and an npm module you can use via "typescript": "npm:@typescript-deploys/pr-build@5.5.0-pr-58352-3".

typescript-bot commented 2 weeks ago

Hey @jakebailey, the results of running the DT tests are ready.

Everything looks the same!

You can check the log here.

typescript-bot commented 2 weeks ago

@jakebailey Here are the results of running the user tests comparing main and refs/pull/58352/merge:

Everything looks good!

typescript-bot commented 2 weeks ago

@jakebailey The results of the perf run you requested are in!

Here they are:

tsc

Comparison Report - baseline..pr

Metric	baseline	pr	Delta	Best	Worst	p-value
Compiler-Unions - node (v18.15.0, x64)
Errors	30	30	~	~	~	p=1.000 n=6
Symbols	62,154	62,154	~	~	~	p=1.000 n=6
Types	50,273	50,273	~	~	~	p=1.000 n=6
Memory used	192,812k (± 0.80%)	192,879k (± 0.73%)	~	192,158k	195,729k	p=0.521 n=6
Parse Time	1.36s (± 0.38%)	1.35s (± 1.21%)	~	1.32s	1.36s	p=0.324 n=6
Bind Time	0.72s	0.72s	~	~	~	p=1.000 n=6
Check Time	9.57s (± 0.30%)	9.58s (± 0.48%)	~	9.54s	9.67s	p=0.807 n=6
Emit Time	2.63s (± 0.59%)	2.62s (± 0.39%)	~	2.60s	2.63s	p=0.182 n=6
Total Time	14.27s (± 0.20%)	14.27s (± 0.26%)	~	14.23s	14.33s	p=0.747 n=6
angular-1 - node (v18.15.0, x64)
Errors	5	5	~	~	~	p=1.000 n=6
Symbols	945,172	945,172	~	~	~	p=1.000 n=6
Types	408,068	408,068	~	~	~	p=1.000 n=6
Memory used	1,222,070k (± 0.00%)	1,222,036k (± 0.00%)	-34k (- 0.00%)	1,222,006k	1,222,060k	p=0.045 n=6
Parse Time	6.92s (± 0.46%)	6.96s (± 0.16%)	+0.04s (+ 0.58%)	6.94s	6.97s	p=0.008 n=6
Bind Time	1.87s (± 0.98%)	1.85s (± 0.22%)	-0.02s (- 1.07%)	1.85s	1.86s	p=0.025 n=6
Check Time	31.43s (± 0.38%)	31.28s (± 0.40%)	~	31.09s	31.41s	p=0.128 n=6
Emit Time	14.65s (± 0.65%)	14.68s (± 0.53%)	~	14.56s	14.75s	p=0.748 n=6
Total Time	54.87s (± 0.20%)	54.77s (± 0.25%)	~	54.62s	54.96s	p=0.199 n=6
mui-docs - node (v18.15.0, x64)
Errors	5	5	~	~	~	p=1.000 n=6
Symbols	1,954,660	1,954,660	~	~	~	p=1.000 n=6
Types	676,415	676,415	~	~	~	p=1.000 n=6
Memory used	1,753,518k (± 0.00%)	1,753,526k (± 0.00%)	~	1,753,516k	1,753,549k	p=0.810 n=6
Parse Time	6.88s (± 0.31%)	7.13s (± 0.30%)	+0.24s (+ 3.54%)	7.09s	7.15s	p=0.005 n=6
Bind Time	2.31s (± 0.45%)	2.33s (± 0.36%)	~	2.31s	2.33s	p=0.109 n=6
Check Time	56.91s (± 0.30%)	56.63s (± 0.52%)	~	56.22s	57.04s	p=0.093 n=6
Emit Time	0.14s (± 5.31%)	0.14s (± 2.95%)	~	0.13s	0.14s	p=0.389 n=6
Total Time	66.25s (± 0.24%)	66.21s (± 0.45%)	~	65.80s	66.65s	p=0.689 n=6
self-build-src - node (v18.15.0, x64)
Errors	0	0	~	~	~	p=1.000 n=6
Symbols	1,215,567	1,215,638	+71 (+ 0.01%)	~	~	p=0.001 n=6
Types	257,612	257,615	+3 (+ 0.00%)	~	~	p=0.001 n=6
Memory used	2,323,296k (± 0.02%)	2,323,248k (± 0.02%)	~	2,322,732k	2,323,921k	p=0.810 n=6
Parse Time	5.06s (± 0.78%)	5.10s (± 0.79%)	~	5.04s	5.15s	p=0.199 n=6
Bind Time	1.90s (± 1.28%)	1.88s (± 1.23%)	~	1.86s	1.92s	p=0.257 n=6
Check Time	33.89s (± 0.40%)	34.08s (± 0.17%)	+0.19s (+ 0.55%)	34.02s	34.14s	p=0.020 n=6
Emit Time	2.64s (± 0.59%)	2.62s (± 0.98%)	~	2.58s	2.65s	p=0.106 n=6
Total Time	43.48s (± 0.31%)	43.68s (± 0.15%)	+0.19s (+ 0.44%)	43.58s	43.76s	p=0.031 n=6
self-build-src-public-api - node (v18.15.0, x64)
Errors	0	0	~	~	~	p=1.000 n=6
Symbols	1,215,567	1,215,638	+71 (+ 0.01%)	~	~	p=0.001 n=6
Types	257,612	257,615	+3 (+ 0.00%)	~	~	p=0.001 n=6
Memory used	2,397,259k (± 0.00%)	2,397,500k (± 0.02%)	~	2,396,852k	2,398,164k	p=0.230 n=6
Parse Time	6.30s (± 0.63%)	6.36s (± 0.84%)	~	6.30s	6.45s	p=0.078 n=6
Bind Time	2.01s (± 0.61%)	2.00s (± 0.84%)	~	1.98s	2.03s	p=0.131 n=6
Check Time	40.65s (± 0.19%)	40.70s (± 0.34%)	~	40.54s	40.89s	p=0.378 n=6
Emit Time	3.13s (± 1.41%)	3.15s (± 1.09%)	~	3.10s	3.19s	p=0.228 n=6
Total Time	52.12s (± 0.26%)	52.24s (± 0.30%)	~	52.01s	52.45s	p=0.128 n=6
self-compiler - node (v18.15.0, x64)
Errors	0	0	~	~	~	p=1.000 n=6
Symbols	256,196	256,212	+16 (+ 0.01%)	~	~	p=0.001 n=6
Types	103,640	103,643	+3 (+ 0.00%)	~	~	p=0.001 n=6
Memory used	424,218k (± 0.01%)	424,265k (± 0.00%)	+47k (+ 0.01%)	424,251k	424,275k	p=0.013 n=6
Parse Time	3.48s (± 0.66%)	3.50s (± 0.52%)	~	3.48s	3.53s	p=0.124 n=6
Bind Time	1.31s (± 1.74%)	1.30s (± 0.49%)	~	1.29s	1.31s	p=0.615 n=6
Check Time	18.19s (± 0.41%)	18.20s (± 0.35%)	~	18.10s	18.30s	p=0.936 n=6
Emit Time	1.38s (± 1.28%)	1.38s (± 1.64%)	~	1.34s	1.41s	p=0.934 n=6
Total Time	24.36s (± 0.33%)	24.38s (± 0.31%)	~	24.25s	24.47s	p=0.630 n=6
ts-pre-modules - node (v18.15.0, x64)
Errors	35	35	~	~	~	p=1.000 n=6
Symbols	224,824	224,824	~	~	~	p=1.000 n=6
Types	93,390	93,390	~	~	~	p=1.000 n=6
Memory used	369,280k (± 0.01%)	369,339k (± 0.01%)	+59k (+ 0.02%)	369,278k	369,410k	p=0.037 n=6
Parse Time	3.68s (± 0.53%)	3.67s (± 0.89%)	~	3.63s	3.72s	p=0.468 n=6
Bind Time	1.92s (± 1.21%)	1.94s (± 1.01%)	~	1.92s	1.96s	p=0.411 n=6
Check Time	19.39s (± 0.29%)	19.45s (± 0.51%)	~	19.31s	19.57s	p=0.172 n=6
Emit Time	0.00s	0.00s	~	~	~	p=1.000 n=6
Total Time	24.98s (± 0.28%)	25.06s (± 0.31%)	~	24.95s	25.16s	p=0.149 n=6
vscode - node (v18.15.0, x64)
Errors	4	4	~	~	~	p=1.000 n=6
Symbols	2,797,340	2,797,340	~	~	~	p=1.000 n=6
Types	950,105	950,105	~	~	~	p=1.000 n=6
Memory used	2,925,439k (± 0.00%)	2,925,543k (± 0.00%)	~	2,925,445k	2,925,654k	p=0.128 n=6
Parse Time	16.70s (± 0.48%)	16.91s (± 0.27%)	+0.21s (+ 1.26%)	16.86s	16.98s	p=0.005 n=6
Bind Time	5.01s (± 2.07%)	5.11s (± 2.48%)	~	4.94s	5.21s	p=0.574 n=6
Check Time	88.76s (± 0.44%)	88.91s (± 0.43%)	~	88.18s	89.16s	p=0.378 n=6
Emit Time	24.58s (± 7.30%)	23.92s (± 0.57%)	~	23.75s	24.06s	p=1.000 n=6
Total Time	135.06s (± 1.57%)	134.85s (± 0.22%)	~	134.28s	135.08s	p=0.109 n=6
webpack - node (v18.15.0, x64)
Errors	0	0	~	~	~	p=1.000 n=6
Symbols	265,853	265,853	~	~	~	p=1.000 n=6
Types	108,438	108,438	~	~	~	p=1.000 n=6
Memory used	410,420k (± 0.02%)	410,488k (± 0.02%)	~	410,372k	410,577k	p=0.199 n=6
Parse Time	4.88s (± 0.70%)	4.88s (± 1.16%)	~	4.83s	4.99s	p=0.687 n=6
Bind Time	2.06s (± 0.85%)	2.07s (± 0.68%)	~	2.05s	2.09s	p=0.743 n=6
Check Time	21.16s (± 0.32%)	21.15s (± 0.34%)	~	21.01s	21.21s	p=0.810 n=6
Emit Time	0.00s	0.00s	~	~	~	p=1.000 n=6
Total Time	28.10s (± 0.33%)	28.10s (± 0.38%)	~	27.93s	28.26s	p=1.000 n=6
xstate-main - node (v18.15.0, x64)
Errors	0	0	~	~	~	p=1.000 n=6
Symbols	523,981	523,981	~	~	~	p=1.000 n=6
Types	178,708	178,708	~	~	~	p=1.000 n=6
Memory used	461,265k (± 0.03%)	461,254k (± 0.02%)	~	461,164k	461,351k	p=1.000 n=6
Parse Time	3.24s (± 0.58%)	3.24s (± 0.42%)	~	3.22s	3.26s	p=0.622 n=6
Bind Time	1.17s (± 0.44%)	1.17s (± 0.64%)	~	1.16s	1.18s	p=0.784 n=6
Check Time	18.15s (± 0.37%)	18.23s (± 0.24%)	~	18.16s	18.27s	p=0.052 n=6
Emit Time	0.00s	0.00s	~	~	~	p=1.000 n=6
Total Time	22.57s (± 0.25%)	22.64s (± 0.16%)	+0.07s (+ 0.32%)	22.59s	22.69s	p=0.037 n=6

System info unknown

Hosts

node (v18.15.0, x64)

Scenarios

Compiler-Unions - node (v18.15.0, x64)
angular-1 - node (v18.15.0, x64)
mui-docs - node (v18.15.0, x64)
self-build-src - node (v18.15.0, x64)
self-build-src-public-api - node (v18.15.0, x64)
self-compiler - node (v18.15.0, x64)
ts-pre-modules - node (v18.15.0, x64)
vscode - node (v18.15.0, x64)
webpack - node (v18.15.0, x64)
xstate-main - node (v18.15.0, x64)

Benchmark	Name	Iterations
Current	pr	6
Baseline	baseline	6

Developer Information:

Download Benchmarks

typescript-bot commented 2 weeks ago

@jakebailey Here are the results of running the top 400 repos comparing main and refs/pull/58352/merge:

Everything looks good!

sheetalkamat commented 2 weeks ago

> When we reuse a program, we keep using that same list, but may rediscover the same files over and over again, pushing more and more into that mapping.

That's not true. Unless the program is completely reused, the fileReasons from the old program are not reused, so they should not change between programs.
What you have fixed here seems like an issue where file (a) may be included by, say, an import from file (b), which was itself imported through a different root file (c). Then, if b is also a root file, we will process its imports again, so file a is added with the same reason, and that is what this deduplication is handling.
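
The scenario described above can be sketched with a tiny import graph. Everything here is hypothetical: the file names, the `imports` table, and the deliberately naive `processFile` walk are assumptions made to reproduce the duplicate, not the compiler's actual traversal.

```typescript
// Hypothetical import graph: c imports b, b imports a; both c and b are root files.
const imports: Record<string, string[]> = {
    "c.ts": ["b.ts"],
    "b.ts": ["a.ts"],
    "a.ts": [],
};
const rootFiles = ["c.ts", "b.ts"];

const reasons = new Map<string, string[]>();

function addReason(file: string, reason: string): void {
    const list = reasons.get(file) ?? [];
    list.push(reason);
    reasons.set(file, list);
}

// Deliberately naive: a file's imports are walked again every time the file
// is reached, which is the assumption this sketch relies on.
function processFile(file: string, reason: string): void {
    addReason(file, reason);
    for (const dep of imports[file] ?? []) {
        processFile(dep, `imported from ${file}`);
    }
}

for (const root of rootFiles) {
    processFile(root, "root file");
}

const reasonsForA = reasons.get("a.ts") ?? [];
const reasonsForB = reasons.get("b.ts") ?? [];
```

In this sketch `a.ts` ends up with the same "imported from b.ts" reason twice, once because b was reached through c and once because b is itself a root file, while `b.ts` carries two distinct reasons.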

If you look at fileReasons, it does not change between programs. It's only changed through findSourceFileWorker, and that is not called when we are trying to determine whether the program can be reused (tryReuseProgramStructure).

jakebailey commented 2 weeks ago

Gotcha; the deduping is somewhat nice though, but if it's not actually the problem then the deduping is probably just plain worse for perf and I should just ditch this PR.

jakebailey commented 1 week ago

Closing in favor of #58398, which includes some aspects of this PR.

microsoft / TypeScript

Ensure equal FileIncludeReasons are not duplicitively added #58352