Closed CAD97 closed 4 years ago
https://github.com/rust-analyzer/rust-analyzer/pull/2182 confirms that this decreases rowan memory usage, with ParseQuery
dropping from 30MB on master, to 18MB with #34, and 9MB with this patch.
The latest commit returns GreenNode
to being actually !Sized
, but keeps the benefit of thin pointers in ArcGreenNode
and GreenElement
by storing type-erased thin pointers and recovering the length from the header.
This passes cargo miri test
now 🎉 We should probably run miri in CI to make sure it stays that way.
@matklad I believe this is ready for review now. I think we should leave the enum NodeOrToken
niche optimization to the compiler to eventually pick up; we'd want to revert to the safe code once it does anyway, so I don't think it's worth the short term code clarity cost to manually take advantage of that niche.
Review hint: each commit should be self contained except a392206 and 1f42bcb, which need to be taken together. Review will probably be easier piecewise by commit (except for that one exception).
@CAD97 did you by any change do any measurements of how effective the interning is? Ie, if we parse something like the old, non-split version of parser.rs from rustc, how many tokens and nodes are there, and which proportion of tokens is deduped? "30MB >18MB > 9MB" size reduction looks super sweet, up to the point it's hard to believe in :) It would be cool to double-check it with back-of-the-envelope calculations
It's also interesting that SmloStr
now becomes completely unnecessary: if we heap allocate tokens anyway, it's better to store token text inline, using the trailing objects trick h.
This now uses thin-dst. It could use some cleanup, but it also fully encapsulates ArcGreenNode
and thin-dst; the outside world only sees the std Arc.
Back-of-the-envelope space savings calculation:
GreenElement
went from 40 bytes to 8 bytes for a saving of 32 bytes per element. As a very rough underestimate of the number of nodes seen by ra analysis-stats ra
, modules + declarations + functions + expressions = 104,548 nodes. That's 3,345,536 bytes saved, or 3.2MiB. For every non-deduped token, we go -8 on bytes allocated, as we go from a 40 byte GreenElement
with the token inline, to a 8 byte GreenElement
pointer to a 48 byte ArcInner<GreenToken>
. For each deduped token, we save 32 bytes. Every node has at least one token, and statistically this severe underestimation is entirely deduped. So that's another 3.2MiB saved.
So an extremely conservative low-effort estimation puts this at at least a 20% memory savings. For a better estimation, we should hook ra analysis-stats to print an estimation of the full node count. A quick way would be to just annotate the cache to log cache hits and misses.
So, @matklad, if this all looks good to you, I'll publish version 1.0.0 of thin-dst (transfer ownership to the rust-analyzer org?) and bump this PR to depend on that so we can start realizing these benefits.
It'd still be cool to make GreenToken
a ThinData
as well with the trailing str instead of SmolStr
, but that should be a followup patch to this imho.
Yeah, the approach sounds like something we should just do!
Publishing 1.0 of thin-dst seems like a good first-step. I haven't looked at this PR closely, but one thing we want to do is to declare GreenNode like struct GrenNode(Arc<GreenNodeData>)
. Hiding the arc inside seems good in general, and should allow to implmenet https://github.com/rust-analyzer/rust-analyzer/pull/2182/files by just changing the Cargo.toml (Ie, I expect to publish a semver-compatible version of rowan)
BTW, I've finished most of crucial refactorings in rust-analyzer for the time being, so I'd love to prioritize this work!
I've got two essays due tomorrow (😨) but after that I can definitely finish this up real quick.
This patch cannot be semver-compatible, as GreenElement
is no longer just an alias for NodeOrToken
and we expose &[GreenElement]
.
That said, I've come around to fully encapsulating the arc in GreenNode
and should be able to make that change tomorrow.
Oh, good catch! I should have exposed
impl Iterator<Item = NodeOrToken<&GreenNode, &GreenToken>> + ExactSizeIterator
As an aside, one of the lessons from iterating through a doesn of design for syntax trees is that enums are a very easy way to expose impl details in a way you can't just fix later
Now that we know the proper direction to take this in (and it needs to be completely refactored to do as such), I'm closing this to PR more piecewise improvement where possible. The intent is to PR the breaking changes and then make the thin-dst backend refactor entirely encapsulated.
The intent is to PR the breaking changes
That's an excellent idea! I'll publish it and make a PR to rust-analyzer, so that we can trivially compare perf
Implements a thin-pointer version of
GreenNode
and uses that to implement a 1×usize
GreenElement
. We do not yet apply any manual niching toSyntaxElement
, though it still could be niched to 2×usize
at zero performance cost.We provide a
match_element!
macro that allows for (what looks like) pattern matching overGreenElement
even though it is no longer an enum.This passes miri! 🎉 We should probably add miri to CI to make sure it stays that way.