rust-lang / rust-analyzer

A Rust compiler front-end for IDEs
https://rust-analyzer.github.io/
Apache License 2.0
14.12k stars 1.58k forks source link

A Plan for Making Rust Analyzer Faster #17491

Open davidbarsky opened 3 months ago

davidbarsky commented 3 months ago

Some background context:

Anyways, here's non-exhaustive (but decently comprehensive!) list of things that we'd like to fix.

Non-Parallel, Single-Threaded Improvements

Parallel Improvements

From a sequencing standpoint, most parallelism improvements in rust-analyzer should happen after single-threaded improvements occurs, as introducing parallelism without making single-core performance improvements simply papers over existing, resolvable performance issues—I believe it is better to have those improvements compound. Besides, for a large number of these changes below, are waiting on Salsa 3.0, as it will enable trivial parallel computation for all databases (https://github.com/salsa-rs/salsa/issues/495) [^1].


I'll add more items as they come up, but I figured I should combine the different discussion topics into a single location.

[^1]: The version of Salsa used by rust-analyzer todays requires getting a snapshot from the ParallelDatabase trait, and ParallelDatabase is only implemented on the RootDatabase. Similarly, parallel query dependencies are not legible to Salsa, which complicates the invalidation story.

Wilfred commented 3 months ago

FWIW I see parallelism fixes as strictly orthogonal. Some are blocked on salsa, but parallel crate graph loading or parallel VFS loading could happen today, and I think they could be decent performance wins.

lnicola commented 3 months ago

Switch from an LRU cache to a scan-resistant cache (LRU-K)

Regarding this, I think the original Salsa used a scan-resistant strategy (SLRU?), see https://github.com/rust-lang/rust-analyzer/blob/ac389ce2ef9fa65edd3aa840648ae4a6637a4040/crates/salsa/src/lru.rs#L228, but it was dropped in favor of https://docs.rs/lru-cache/ in 2.0 / 2022 (and 3.0 too). I don't know how effective it was, though.

Veykril commented 3 months ago

During project loading, rust-analyzer repeatedly queries the project layout from rustc repeatedly, as there isn’t a clear end-state for project loading. Deferring this until the very end would likely help with loading large Rust workspaces.

Not sure I fully understand this, but if I do, then I think this is caused by the VFS being processed fairly slowly. Whenever a file gets created (or something along the lines, which is the case on start up for all files we load), we will requery the project layout. On fairly large projects the VFS just takes ages to report and be fully loaded.

Veykril commented 3 months ago

This isn’t the most fair comparison, as Rowan is optimized for editing trees—hence the structural sharing—but a syntax tree library structure with a more contiguous representation of the trees themselves would mean that the initial workspace loading/indexing could be substantially faster.

Given I'm of the opinion that the mutable API is more problematic than useful, we should definitely go for a flat structure I think. Edits / Construction of trees from multiple trees being more expensive shouldn't be a big problem I think, there is little that would care about that, assists are lazy in terms of their edits so there is no problem with that there, and aside from those not much really does tree style edits in the first place! So we should absolutely replace the current rowan -> https://github.com/rust-lang/rust-analyzer/issues/6584 (also to note, rowan makes heavy use of the header pattern which is unsound under SB/TB so swapping that would generally be good in that regard as well)

davidbarsky commented 3 months ago

Not sure I fully understand this, but if I do, then I think this is caused by the VFS being processed fairly slowly.

I think the VFS is partly responsible, but I've come to this conclusion by through a combination of debugging this and setting "RA_LOG": "rust_analyzer=info". When opening rust-analyzer, I saw 40 instances of INFO did fetch workspaces workspaces=[Ok(Json { n_crates: 230, n_sysroot_crates: 11, n_rustc_cfg: 36, toolchain: Some(Version { major: 1, minor: 78, patch: 0 }), data_layout: Ok("e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"), n_cfg_overrides: 2 })] followed by GlobalState::switch_workspaces switching workspaces.

I also did see a lot of GlobalState::handle_event{event=Event::Vfs}: task queue len: 1 (963, to be exact), but I'm not sure if that's evidence of the VFS being slow or rust-analyzer trying to be concurrent when, strictly speaking, it doesn't really need to be.

Whenever a file gets created (or something along the lines, which is the case on start up for all files we load), we will requery the project layout.

I have observed this, and while that certainly doesn't help, I think that fixing that issue would be pretty easy if it were possible for the VFS (at least during initial indexing) to read all sources at once. I wrote a small, incomplete file loader and ran it against buck2's rust-project.json. It "loaded" all 876 crates in 1.5 seconds on a cold file system and 1.14 seconds on a warm file system, whereas rust-analyzer normally takes over a minute for the same kind work. I think that one viable approach would be to add some sort of bulk loading functionality to the VFS that is able to create the graph once.

lnicola commented 2 weeks ago

Re flat syntax trees, leaving this here: https://github.com/saschagrunert/indextree.