microsoft / TypeScript

Reusing nodeLinks and symbolLinks #54234

Open · yf-yang opened 1 year ago

yf-yang commented 1 year ago

This suggestion has the same setting as, and builds on top of, #54222: a language server scenario where only a few characters are modified and everything else remains untouched. With the incremental parser and some additional flags, we can distinguish the "clean area" and "dirty area" of nodes. With incremental binding, there exist clean and dirty symbols (though we are not yet able to distinguish them).

Semantic analysis is performed lazily by the type checker; once computed, intermediate and final results are cached in nodeLinks/symbolLinks. Each node/symbol has its own cache, but the caches are actually stored program-wide, so nodes/symbols have to look up their own entries by id.
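
For reference, a minimal sketch of this lookup pattern (names and shapes are illustrative, not the real checker internals): a program-wide table of links objects, indexed by an id assigned to each node on first access.

```ts
// Simplified sketch of program-wide per-node caches looked up by id.
// Names and shapes are illustrative, not the actual compiler types.
interface SketchNode {
    id?: number;                 // assigned lazily on first lookup
}

interface SketchNodeLinks {
    resolvedSymbol?: unknown;    // e.g. the symbol an identifier resolves to
    resolvedType?: unknown;      // e.g. the type computed for the node
}

const nodeLinksTable: SketchNodeLinks[] = []; // lives on the program/checker, not on nodes
let nextNodeId = 1;

function getNodeId(node: SketchNode): number {
    return node.id ?? (node.id = nextNodeId++);
}

function getNodeLinks(node: SketchNode): SketchNodeLinks {
    const id = getNodeId(node);
    return nodeLinksTable[id] ?? (nodeLinksTable[id] = {});
}
```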

Whenever a change occurs, the type checker applies an aggressive strategy: it assumes all caches are outdated and recreates nodeLinks and symbolLinks, even though some of them could actually be reused.

The core step of semantic analysis is resolution, which refers to the computation needed to connect a variable reference node to its symbol (declaration). Some caches need multiple resolution steps (for example, getting the properties of a deeply inherited class). After resolution, further computation is performed to get a type or other information. I'll start with the caches that involve a single resolution step. Also, when considering performance, I'd like to ignore the overhead of resolution between two files and just describe theoretically which procedures could be bypassed, to keep the number of factors involved small.

Reuse one step resolution

There are four scenarios:

Now we have a problem: how do we tell whether a node/symbol is dirty or clean? Dirty nodes/symbols are detached from the new AST/symbol tables, so we never get a chance to visit them, yet they are not released by those caches.

Here I offer one possible implementation: use a file version. Each node/symbol would have a "version" member. During the incremental parsing phase, initialize/refresh the version member of each node to the latest file version; the incremental binding stage would do the same for symbols. Note that the dirty flag I mentioned in #54222 could then be removed. In the type checker, whenever it finds a cached entry, it compares the entry's version to the file version.
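
Roughly, the check could look like this. This is a sketch only; the `version` members and the `currentFileVersion` counter are assumptions for illustration, not existing compiler fields.

```ts
// Hypothetical version-based invalidation, following the idea above.
interface VersionedNode {
    version: number;             // refreshed by incremental parse/bind
}

interface VersionedLinks {
    version: number;             // stamped when the cache entry is written
    resolvedSymbol?: unknown;
}

let currentFileVersion = 0;      // bumped on every edit

function writeResolvedSymbol(links: VersionedLinks, symbol: unknown): void {
    links.resolvedSymbol = symbol;
    links.version = currentFileVersion;   // stamp with the version it was computed for
}

// When the checker finds a cached entry, compare versions before trusting it.
function tryReuseResolvedSymbol(node: VersionedNode, links: VersionedLinks | undefined): unknown {
    if (links && node.version === currentFileVersion && links.version === currentFileVersion) {
        return links.resolvedSymbol;      // clean: reuse the cached resolution
    }
    return undefined;                     // dirty or missing: recompute as usual
}
```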

I am not sure how easy this is to implement, because it means every cache entry access (specifically, every key access in nodeLinks/symbolLinks) needs to be modified. As far as I know, they would have to be updated one by one, because there are tons of code patterns. Anyway, I'm discussing the idea theoretically here.

What about multi-step resolution

If we want to reuse multi-step resolution caches, then we need to store more information about the resolution chain in the node that holds the cache, which may not be a good idea. Even the simpler case where a node points to a type via a single resolution step is not viable, because we cannot tag the cached type with a version.

However, I'd like to compare those scenarios against our baseline, the existing strategy: clear everything and recompute whatever is needed. Even if we simply choose to recompute in those scenarios, we can still reuse the one-step resolutions, which is at least better than reusing nothing.

Therefore, the complete suggestion is: identify the cache entries where both the cache holder and the cached target can be simply versioned (for example, NodeLinks.resolvedSymbol) and try to reuse them. For the others, we can keep the existing strategy.

Tradeoff of memory usage

Note that a dirty cached target is not released until the type checker visits it again, so this may introduce higher memory usage. However, I think the tradeoff is minor, because we can adopt other strategies, such as scanning nodeLinks/symbolLinks every 200 versions and removing stale entries.
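
For instance, a periodic sweep could look roughly like this (the table shape and the 200-version interval are just illustrative assumptions):

```ts
// Hypothetical cleanup pass: every N versions, drop cache entries whose
// version is older than the current one so they can be garbage collected.
const SWEEP_INTERVAL = 200;

function maybeSweepStaleLinks(
    linksTable: ({ version: number } | undefined)[],
    currentVersion: number
): void {
    if (currentVersion % SWEEP_INTERVAL !== 0) return;
    for (let id = 0; id < linksTable.length; id++) {
        const links = linksTable[id];
        if (links && links.version < currentVersion) {
            linksTable[id] = undefined;   // release the stale entry
        }
    }
}
```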

Maybe starting from files

The idea is not complicated, but the codebase is, so I'd suggest an alternative:

We can start by reusing nodeLinks/symbolLinks for clean files. Files that did not get rebound are identified as the "clean area" and rebound files as the "dirty area"; at that granularity, the information is already available.
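
A sketch under the same assumptions: keep per-file links tables and only throw away the ones for files that were rebound. The `wasRebound` flag here is assumed for illustration, not an existing API.

```ts
// Hypothetical file-granularity reuse: carry links over for files that were
// not rebound, and rebuild them only for rebound ("dirty") files.
interface SketchSourceFile {
    fileName: string;
    wasRebound: boolean;                  // assumed flag set by incremental binding
}

const linksByFile = new Map<string, object[]>();   // per-file nodeLinks/symbolLinks

function refreshLinksForProgram(files: SketchSourceFile[]): void {
    for (const file of files) {
        if (file.wasRebound) {
            linksByFile.set(file.fileName, []);    // dirty area: start fresh
        }
        // clean area: keep the existing links table untouched
    }
}
```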

RyanCavanaugh commented 1 year ago

I'm not sure this is quite feasible. The results computed from symbol links and node links do not necessarily have the same meaning in a future program that has edits, even if those edits are outside the current file.

The other problem is that most of these node links and symbol links are not actually very computationally intensive. So in an interactive scenario, which is the only place you could do incremental updates, the amount of work being done here is actually quite minimal.

We could take a PR to examine it and see if there are meaningful gains, but I don't expect to be able to find much here.

yf-yang commented 1 year ago

Sure. I don't know how to benchmark a language server (do you have any guides? I can only find performance tracing for tsc in the wiki).

However, I did try to observe the performance in a naive way. Say I click a random function in compiler/binder.ts and run "go to references" in VS Code; I can feel some lag (0.1s? 0.2s? sometimes 1s? I have no idea what the exact lag is, but it is not huge). The second time I do the same thing, the results pop up in a flash. If I randomly insert some characters in another file (to clear the caches), the lag comes back.

I am using a 2.4 GHz 8-core Intel i9 MacBook Pro with 32 GB RAM, but the test runs inside a devcontainer limited to 6 cores and 24 GB RAM. Anyway, I assume my development environment's performance is above average, but I have no idea whether this is something worth working on.

This is absolutely naive; if you can provide some reference for what counts as a "notable performance optimization", then maybe I can do something more standard up front 😄

jakebailey commented 1 year ago

If you're just performance benchmarking tsc, then all our perf suite does is run tsc --extendedDiagnostics. If you are benchmarking tsserver requests, then that can be orchestrated with https://www.npmjs.com/package/@typescript/server-harness.