rust-lang / rust-analyzer

A Rust compiler front-end for IDEs
https://rust-analyzer.github.io/

Investigate ways to reduce memory usage #7330

Open matklad opened 3 years ago

matklad commented 3 years ago

rust-analyzer uses quite a bit of RAM; what can we do about it?

Part of https://github.com/rust-analyzer/rust-analyzer/issues/7325

The goal here is to understand where all the bytes go.

Steps:

matklad commented 3 years ago

cc @jonas-schievink

jonas-schievink commented 3 years ago

A significant amount of ItemTree memory is used by path data stored within TypeRefs. I've added simple per-ItemTree interning of TypeRefs in https://github.com/rust-analyzer/rust-analyzer/pull/7557, but they still take up considerable space. We should investigate:

jonas-schievink commented 3 years ago

Most of my suggestions in https://github.com/rust-analyzer/rust-analyzer/issues/7330#issuecomment-776927543 have now been implemented in https://github.com/rust-analyzer/rust-analyzer/pull/8284.

memoryruins commented 3 years ago

https://github.com/rust-analyzer/rust-analyzer/pull/8433 by @flodiebold

This uses the new interning infrastructure for most type-related things, where it had a positive effect on memory usage and performance. In total, this gives a slight performance improvement and quite a good memory reduction (1119 MB -> 885 MB on rust-analyzer, 1774 MB -> 1188 MB on Diesel).
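For readers unfamiliar with the technique: interning (hash-consing) stores each distinct value once behind a shared pointer, so a type that occurs thousands of times across the crate graph costs one allocation in total, and equality checks can degrade to a pointer comparison. Below is a minimal sketch of the idea; the names (`Interner`, `intern`) are illustrative, not rust-analyzer's actual API:

```rust
// Minimal hash-consing sketch (hypothetical names, not rust-analyzer's
// actual interning infrastructure): each distinct value is allocated
// once and shared via `Arc`.
use std::collections::HashSet;
use std::hash::Hash;
use std::sync::{Arc, Mutex};

pub struct Interner<T: Eq + Hash>(Mutex<HashSet<Arc<T>>>);

impl<T: Eq + Hash> Interner<T> {
    pub fn new() -> Self {
        Interner(Mutex::new(HashSet::new()))
    }

    /// Return the shared copy of `value`, inserting it on first sight.
    pub fn intern(&self, value: T) -> Arc<T> {
        let mut set = self.0.lock().unwrap();
        if let Some(existing) = set.get(&value) {
            return Arc::clone(existing);
        }
        let interned = Arc::new(value);
        set.insert(Arc::clone(&interned));
        interned
    }
}

fn main() {
    let types = Interner::new();
    let a = types.intern("Vec<u32>".to_string());
    let b = types.intern("Vec<u32>".to_string());
    // Equal values share one allocation, so equality can fall back
    // to a pointer comparison.
    assert!(Arc::ptr_eq(&a, &b));
}
```

A production interner would shard the table or use a concurrent map to avoid lock contention, but the memory win is the same.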

matklad commented 3 years ago

Got curious about how we compare to other, similar tools, so I measured rust-analyzer's memory usage against the on-disk artifacts of cargo check and against JetBrains' on-disk caches, all for the rust-analyzer project itself. Here's what I did:

rm -rf target && cargo check && cat target/** > target
rm -rf ~/.cache/JetBrains && open CLion and wait for indexing && cat ~/.cache/JetBrains/** > jbcache
cat "rust-analyzer memory with https://serverfault.com/questions/173999/dump-a-linux-processs-memory-to-file" > ramem

Then I compressed each one of those with snappy.
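As a sketch of this step: the `snap` crate implements Snappy in Rust, and its `FrameEncoder` writes the framing format that conventionally carries the .sz suffix (the actual .sz files below came from external snappy tooling, not this code):

```rust
// Compress each dump with a Snappy frame encoder and report the
// ratio, using compressed size as a rough proxy for how much actual
// information each representation holds. Assumes the `snap` crate.
use snap::write::FrameEncoder;
use std::fs;
use std::io::Write;

fn main() -> std::io::Result<()> {
    for name in ["jbcache", "ramem", "target"] {
        let raw = fs::read(name)?;
        let mut enc = FrameEncoder::new(Vec::new());
        enc.write_all(&raw)?;
        let compressed = enc.into_inner().expect("failed to flush encoder");
        println!(
            "{name}: {}M raw, {}M as .sz ({:.1}x)",
            raw.len() >> 20,
            compressed.len() >> 20,
            raw.len() as f64 / compressed.len() as f64,
        );
    }
    Ok(())
}
```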

Here's the result:

144M  jbcache
117M  jbcache.sz

1.0G  ramem
142M  ramem.sz

613M  target
316M  target.sz

So in terms of the actual data we store, we seem to be in roughly the right order of magnitude: compressed, our memory image (142M) is comparable to JetBrains' caches (117M). However, the in-memory representation of that data is really inefficient: it shrinks about 7x under snappy, while the JetBrains caches barely compress at all.

To put the numbers into perspective, here's the amount of source code we are working with:

42M  src
12M  src.sz

matklad commented 3 years ago

Just for fun, here's the memory image of CLion:

1.8G  clmem
516M  clmem.sz

It compresses much worse than our image.

Logarithmus commented 3 years ago

> Just for fun, here's the memory image of CLion:
>
> 1.8G  clmem
> 516M  clmem.sz
>
> It compresses much worse than our image.

What's *.sz? Is it some new compression format like *.gz or something?

bjorn3 commented 3 years ago

I guess snappy: matklad said above that he compressed the dumps with snappy, and .sz is the conventional suffix for Snappy's framing format.