This is the result of applying this technique of boxing/deduping GreenToken to compress GreenElement and, as a result, the allocation for a GreenNode's children. It also applies a minimal flexible array member technique for the GreenNode's children, allocating them inline.
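For readers unfamiliar with the inline-children idea, here is a minimal sketch of what "allocating them inline" can look like in safe Rust. It is not the code in this PR: `Node`, `DynNode`, `make_node`, and the `u32` children are hypothetical stand-ins, and the sketch relies on an unsized last field plus an unsizing coercion rather than any manual allocation.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a green node: a small header followed by its
/// children, stored inline in the same allocation (the last field may be unsized).
pub struct Node<C: ?Sized> {
    pub kind: u16,
    pub children: C,
}

/// The dynamically sized form: the children are a slice whose length is only
/// known at run time and is carried in the (fat) pointer.
pub type DynNode = Node<[u32]>;

/// Builds a node with three children in a single allocation, then erases the
/// array length through an unsizing coercion at the return site.
pub fn make_node() -> Arc<DynNode> {
    let sized: Arc<Node<[u32; 3]>> = Arc::new(Node {
        kind: 7,
        children: [10, 20, 30],
    });
    sized // Arc<Node<[u32; 3]>> coerces to Arc<Node<[u32]>>
}

/// Size of the pointer to such a node, in machine words.
pub fn arc_dyn_node_words() -> usize {
    std::mem::size_of::<Arc<DynNode>>() / std::mem::size_of::<usize>()
}
```

In this sketch `Arc<DynNode>` is a fat pointer (address plus child count), which is why the pointer itself takes two words.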
This is as far as the technique can go with using the standard Arc and NodeOrToken. GreenElement = NodeOrToken<Arc<GreenNode>, Arc<GreenToken>> is 3xusize large, but can theoretically be niched to 2xusize. (NB: the same niche optimization can be applied to SyntaxElement.)
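A rough way to see where the size comes from is to measure it. This is a sketch under assumptions, not rowan's definitions: `NodeOrToken`, `Node`, and `Token` below are simplified, hypothetical versions. The exact size of the enum depends on the compiler's layout choices, so the sketch only reports it instead of hard-coding 3.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Simplified two-variant enum in the spirit of `NodeOrToken`.
pub enum NodeOrToken<N, T> {
    Node(N),
    Token(T),
}

/// Hypothetical node: header plus inline children, so it is dynamically sized
/// and `Arc<Node>` is a fat pointer (address + length).
pub struct Node {
    pub kind: u16,
    pub children: [u32],
}

/// Hypothetical token: a sized type, so `Arc<Token>` is a thin pointer.
pub struct Token {
    pub kind: u16,
    pub text: String,
}

pub type Element = NodeOrToken<Arc<Node>, Arc<Token>>;

/// Size of `T` in machine words.
pub fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

/// (fat Arc, thin Arc, element) sizes in machine words.
pub fn sizes() -> (usize, usize, usize) {
    (words::<Arc<Node>>(), words::<Arc<Token>>(), words::<Element>())
}
```

The two-word fat pointer plus a separate discriminant word gives the 3x figure; a layout that hides the discriminant in unused pointer bits gets down to 2x. Whether a given compiler version does that automatically is something to measure with `sizes()` rather than assume.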
I've split this PR up into commits each representing a logical step.
It is possible to optimize size further, but I'd like to propose those in a second PR on top of this one. Remaining potential optimization steps along these lines:
- Implement a specific GreenElement to manually niche Arc<GreenNode> and Arc<GreenToken>. sizeof(GreenElement) = 2xusize.
- Implement a specific SyntaxElement to manually niche SyntaxNode and SyntaxToken. sizeof(SyntaxElement) = 2xusize.
- Implement a specific Arc<GreenNode> to make ptr-to-GreenNode a thin ptr instead of a fat ptr. GreenElement = NodeOrToken would be 2xusize. Manually non-zero-cost niched GreenElement (tag in alignment bits) would be 1xusize.

Oh and by the way: I ran cargo miri test and everything passes here. Unfortunately, it does appear that miri does not like flexible array members (for further optimization).
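To make the "tag in alignment bits" step from the list above concrete, here is a minimal sketch. It is not a proposal for the actual implementation: `PackedElement`, `ElementRef`, `Node`, and `Token` are hypothetical, and it uses `Box` instead of `Arc` and plain integer casts to keep the example short. The masking on every access is where the non-zero cost comes from.

```rust
use std::mem::align_of;

/// Hypothetical payloads. Both are at least 2-byte aligned, so bit 0 of a
/// pointer to either is always zero and can carry the tag.
pub struct Node {
    pub kind: u32,
}
pub struct Token {
    pub text: String,
}

const _: () = assert!(align_of::<Node>() >= 2 && align_of::<Token>() >= 2);

const TAG_MASK: usize = 0b1;

/// One pointer-sized word: low bit 0 means `Box<Node>`, low bit 1 means `Box<Token>`.
pub struct PackedElement(usize);

pub enum ElementRef<'a> {
    Node(&'a Node),
    Token(&'a Token),
}

impl PackedElement {
    pub fn from_node(node: Box<Node>) -> Self {
        PackedElement(Box::into_raw(node) as usize)
    }

    pub fn from_token(token: Box<Token>) -> Self {
        PackedElement(Box::into_raw(token) as usize | TAG_MASK)
    }

    /// Strips the tag bit and reinterprets the address according to the tag.
    pub fn get(&self) -> ElementRef<'_> {
        let addr = self.0 & !TAG_MASK;
        // SAFETY: `addr` came from `Box::into_raw` in a constructor and the
        // allocation is still owned by `self`.
        unsafe {
            if self.0 & TAG_MASK == 0 {
                ElementRef::Node(&*(addr as *const Node))
            } else {
                ElementRef::Token(&*(addr as *const Token))
            }
        }
    }
}

impl Drop for PackedElement {
    fn drop(&mut self) {
        let addr = self.0 & !TAG_MASK;
        // SAFETY: same ownership argument as in `get`; each box is freed exactly once.
        unsafe {
            if self.0 & TAG_MASK == 0 {
                drop(Box::from_raw(addr as *mut Node));
            } else {
                drop(Box::from_raw(addr as *mut Token));
            }
        }
    }
}
```

With this layout `size_of::<PackedElement>()` equals `size_of::<usize>()`. A real version would need reference counting and more care around pointer provenance than these casts provide.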
Results in rust-analyzer:
With this branch
```powershell
PS D:\usr\Documents\Code\Rust\rust-analyzer> cargo run --bin ra_cli --release -- analysis-stats .
    Finished release [optimized + debuginfo] target(s) in 0.46s
     Running `target\release\ra_cli.exe analysis-stats .`
Database loaded, 221 roots, 1.0601376s
Crates in this dir: 27
Total modules found: 331
Total declarations: 11135
Total functions: 3839
Item Collection: 11.8499283s, 0b allocated 0b resident
Total expressions: 89244
Expressions of unknown type: 6960 (7%)
Expressions of partially unknown type: 3522 (3%)
Type mismatches: 3568
Inference: 36.3460289s, 0b allocated 0b resident
Total: 48.1964408s, 0b allocated 0b resident
PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- symbols }
    Finished release [optimized + debuginfo] target(s) in 0.45s

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 738
Ticks             : 7380228
TotalDays         : 8.54193055555555E-06
TotalHours        : 0.000205006333333333
TotalMinutes      : 0.01230038
TotalSeconds      : 0.7380228
TotalMilliseconds : 738.0228

PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- parse --no-dump }
    Finished release [optimized + debuginfo] target(s) in 0.44s
     Running `target\release\ra_cli.exe parse --no-dump`

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 603
Ticks             : 6035426
TotalDays         : 6.98544675925926E-06
TotalHours        : 0.000167650722222222
TotalMinutes      : 0.0100590433333333
TotalSeconds      : 0.6035426
TotalMilliseconds : 603.5426
```

Without this branch (5451bfb9)
```powershell
PS D:\usr\Documents\Code\Rust\rust-analyzer> cargo run --bin ra_cli --release -- analysis-stats .
    Finished release [optimized + debuginfo] target(s) in 0.45s
     Running `target\release\ra_cli.exe analysis-stats .`
Database loaded, 220 roots, 1.0174838s
Crates in this dir: 27
Total modules found: 331
Total declarations: 11135
Total functions: 3839
Item Collection: 10.509602s, 0b allocated 0b resident
Total expressions: 89241
Expressions of unknown type: 6959 (7%)
Expressions of partially unknown type: 3522 (3%)
Type mismatches: 3569
Inference: 34.963529s, 0b allocated 0b resident
Total: 45.4737377s, 0b allocated 0b resident
PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- symbols }
    Finished release [optimized + debuginfo] target(s) in 0.44s

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 587
Ticks             : 5875475
TotalDays         : 6.80031828703704E-06
TotalHours        : 0.000163207638888889
TotalMinutes      : 0.00979245833333333
TotalSeconds      : 0.5875475
TotalMilliseconds : 587.5475

PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- parse --no-dump }
    Finished release [optimized + debuginfo] target(s) in 0.44s
     Running `target\release\ra_cli.exe parse --no-dump`

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 573
Ticks             : 5737453
TotalDays         : 6.64057060185185E-06
TotalHours        : 0.000159373694444444
TotalMinutes      : 0.00956242166666667
TotalSeconds      : 0.5737453
TotalMilliseconds : 573.7453
```

I can't test allocation pressure on Windows. The way this is right here, it looks like a consistent loss.
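To put a number on "consistent loss", the timings above can be turned into relative changes. This is only a small arithmetic sketch: `slowdown_percent`, `RUNS`, and `report` are names made up here. The inputs are single runs copied from the output above, measured through `cargo run`, so they should be read as rough indications.

```rust
/// Relative change of `with` compared to `without`, in percent.
/// Positive means the branch is slower.
pub fn slowdown_percent(with: f64, without: f64) -> f64 {
    (with - without) / without * 100.0
}

/// (label, with this branch, without this branch), copied from the runs above.
pub const RUNS: [(&str, f64, f64); 3] = [
    ("analysis-stats total (s)", 48.1964408, 45.4737377),
    ("symbols (ms)", 738.0228, 587.5475),
    ("parse --no-dump (ms)", 603.5426, 573.7453),
];

/// One formatted line per measurement, with a signed percentage.
pub fn report() -> Vec<String> {
    RUNS.iter()
        .map(|(label, with, without)| {
            format!("{label}: {:+.1}%", slowdown_percent(*with, *without))
        })
        .collect()
}
```

All three come out positive: roughly 5 to 6 percent for `analysis-stats` and `parse --no-dump`, and roughly a quarter slower for `symbols`.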