Closed CAD97 closed 4 years ago
ra is ported to use this branch at https://github.com/rust-analyzer/rust-analyzer/pull/2104 for perf analysis.
Still don't have time to take a closer look now, but I remember a couple of ideas I had to deal with big tokens:
|node header|child node 1|child node 2|child node3|token 1|token 2|token3|
The latter approach is interesting, because it can, in theory, give us cheap O(1)
access to the nth child. So, something like trait.name
would do something like self.children().get(3)
, rather than the current iteration.
To sketch the layout/sizes that we currently have (just looking at usize=u64
):
master
:
GreenElement { // align 8, size 40
tag: enum { Node, Token }, // align 1/8, size 1/8
// padding 7_7/8
payload: union { // align 8, size 32
node: GreenNode, // align 8, size 24
token: GreenToken, // align 8, size 32
}
}
GreenNode { // align 8, size 24
kind: SyntaxKind, // align 2, size 2
// padding 2
text_len: TextUnit, // align 4, size 4
children: Arc<[GreenElement]>, // align 8, size 16
}
GreenToken { // align 8, size 32
kind: SyntaxKind // align 2, size 2
text: SmolStr, // align 8, size 24
}
"Two array":
GreenNode { // align 8, size dyn
text_len: TextUnit, // align 4, size 4
child_count: u16, // align 2, size 2
node_kind: b(child_count), // `child_count` bits to specify interleaving of nodes/tokens
// this also gives us the sizes of nodes/tokens via count_ones/count_zeros
// padding to 8; none when `child_count % 64 == 16`
nodes: [Arc<GreenNode>], // align 8, size count*16
tokens: [GreenToken], // align 8, size count*32
}
// note: vla would be `[u8]` from rustc's POV and encompass everything after `child_count`
GreenToken { // align 8, size 32
kind: SyntaxKind // align 2, size 2
text: SmolStr, // align 8, size 24
}
So, something like trait.name would do something like self.children().get(3), rather than the current iteration.
TRAIT_DEF@[0; 16)
TRAIT_KW@[0; 5) "trait"
WHITESPACE@[5; 6) " "
NAME@[6; 7)
IDENT@[6; 7) "X"
WHITESPACE@[7; 8) " "
ITEM_LIST@[8; 16)
L_CURLY@[8; 9) "{"
WHITESPACE@[9; 15) "\n \n"
R_CURLY@[15; 16) "}"
TRAIT_DEF@[0; 16)
VISIBILITY@[0; 3)
PUB_KW@[0; 3) "pub"
WHITESPACE@[3; 4) " "
TRAIT_KW@[4; 9) "trait"
WHITESPACE@[9; 10) " "
NAME@[10; 11)
IDENT@[10; 11) "X"
WHITESPACE@[11; 12) " "
ITEM_LIST@[12; 16)
L_CURLY@[12; 13) "{"
WHITESPACE@[13; 15) "\n\n"
R_CURLY@[15; 16) "}"
I suppose if "get nth non-token node" was a cheap operation the layout may be changed to support using that, but currently nth non-token-node doesn't mean anything because optional nodes are just not present when not present. Also, for losslessness, it adds needing to keep track of the interleaving of the two arrays.
I still have yet to figure out the exact cost of doing a variable length encoding scheme.
Having a VLA and VLE seems like a big win IF we also make it a thin pointer... which requires inlining an Arc implementation... which is... scary... but I'm going to try it out because who doesn't like a hard problem.
GreenNode { // align(u64) size(dyn)
head: { // align(u64) size(u64)
text_len: TextUnit, // u32
kind: SyntaxKind, // u16
child_count: u16,
}
children: [u64],
// 0u64 ArcGreenNode // (u64, usize, (u32 padding if usize=u32))
// 1u64 GreenToken // (u64, SyntaxKind, (u48 padding), SmolStr)
// NB: cannot compress this due to requirement of &GreenToken
}
That’s not related to the PR really, but let me write down this idea...
Looks like the problem we have is that tokens are large. And the reason for that mainly is that SmolStr is 3 pointers large. Should we replace it .... with interning?
I am super against rustc style interning, which uses thread locals and index: I really want the syntax tree to be a value type with, khm, no strings attached. That was the primary motivation for small str.
But what if we do file-local interning using Arc
Sorry for super-confusing wording: this is me failing spectacularly to get a good nights rest befor tomorrow flight :-)
A tree-local deduped boxing of tokens definitely would allow the array to be much more regular. And it would eliminate a lot of the trickiness required in #33 which I personally think has crossed the line into "too clever".
I considered the idea of boxing the tokens earlier and just overlooked the ability to dedupe them. That approach is definitely worth trying out, so I'll play with it over the weekend.
Closing in favor of #34.
Supersedes #31 by doing it for real this time. In addition, the laundry list of assumptions have been reduced to just alignment of
GreenNode
and offset ofGreenNode
fields, which are tested but not checked at runtime. The one unwritten assumption is that there is no padding afterGreenNode.children
, which I think is indeed guaranteed by#[repr(C)]
.I still expect this to be neutral or negative on performance.
GreenNode
used to be3xusize
and nowArc<GreenNode>
is2xusize
, butGreenElement
is still5xusize
due to storing the4xusize
largeGreenToken
inline. The internalcursor::NodeData
remains at5xusize
large as well asSyntaxNode
,SyntaxToken
, andSyntaxElement
remaining unchanged.Realistically, this removes one pointer chase in
NodeData
to get to the green node's children at the cost of some tricky unsafe allocation. It also adds a pointer chase inGreenElement
to get to aGreenNode
's kind or len.