Closed matklad closed 5 years ago
I've really only used the SyntaxNode
kind, so I think if we're removing Send
+ Sync
from there we'll need a way to convert it back to a GreenNode
and back. Otherwise, I love the idea of using Rc
for most computation :)
preliminary testing (parsing parser.rs and debug-dumping it to dev/null) show that this is indeed significantly more memory efficient and slightly faster
bors r+
perf measuremets in https://github.com/rust-analyzer/rust-analyzer/pull/1545
bors r+
I've really only used the SyntaxNode kind, so I think if we're removing Send + Sync from there we'll need a way to convert it back to a GreenNode and back.
Curiously, in the whole of rust-analyzer this was ever needed for root nodes, so a type which stores syntax errors as well is used for this: https://github.com/rust-analyzer/rust-analyzer/blob/cf932181cf18c1e5fa99cb024bdb92784ed9649e/crates/ra_syntax/src/lib.rs#L60-L65
It's been a long time since I've re-written the core of rowan, so I figure out this is a high time for yet another rewrite! :)
The main motivation here is memory usage: rowan trees weigh a ton. Specifically, it is SyntaxNodes (red trees) which are heavy:
An old thread about roslyn is illuminating here (thanks @KirillOsenkov for documenting this!). The way they solve this problem in roslyn is by using weak references for some red-tree links (in particular, function bodies are stored using weak refs). This allows GC to collect red trees while maintaining reference identity, which is super cool.
Unfortunately, we can't easily use weak refs in rust: the central property of the previous solution was that traversal always yields a plain reference, and we can't track if a plain reference is used or not.
So, this PR proposes a radically different approach for handing
SyntaxNode
s. Instead of lazily allocating red nodes in arena, we create and tear down them on the fly during traversal. That is, at every instant during traversal we only keep in memory the red path to the root. Naively, this can be implemented asThat would be pretty inefficient though: traversal would require a ton of allocations and atomic reference count increments!
So we do something horrible and beautiful instead: we make
SyntaxNode
s notSend/Sync
. This allows us to:Rc
instead of anArc
SyntaxNode
s!Send
seems strange and limiting, but should work fine in practice I think: the green tree itself is perfectly thread safe still, it's only a particular tree traversal that needs to happen on a specific node. If you want to send a particular node to the other thread, you can send the root node together with a "pointer" (text span) inside the tree.On the other hand, traversal becomes:
This still seems costlier than our previous reference-based traversal, but I am not sure that's the case without the benchmarks:
Benchmarking this properly will require adapting rust-analyzer to the new API :)