Use sorbus internally as the green tree implementation

CAD97 commented 4 years ago

This rejigs rowan's green tree implementation to use sorbus's.

This implementation completely encapsulates sorbus. (That's part of the complexity of the patch, as we need to convert between rowan::GreenNode and sorbus::green::Node on the heap, copyless.) This is due in part to an initial attempt not doing this showing multiple impls rowan does that would be incoherent if it were to just reexport the sorbus types as appropriate.

rc-borrow, however, is an added public dependency. This is because while rowan previously smashed the green node and token handles together by ensuring they never overlap, and then handed out effectively &Arc<ActualNode>, sorbus's node type is actually the type on the heap, and the owned handle is a normal Arc<Node>.

Basically, what was previously GreenNode is now Arc<GreenNode>, and what was previously &GreenNode is now ArcBorrow<'_, GreenNode>.

I verified that the tests and both examples run clean under miri (after increasing the recursion limit).

Draft until at the very least sorbus is published to crates.

CAD97 commented 4 years ago

This now exposes the "child at offset" functionality required to properly solve https://github.com/rust-analyzer/rust-analyzer/issues/3934.

matklad commented 4 years ago

I am both excited to see this and a bit terrified review-time-wise :)

CAD97 commented 4 years ago

@matklad if you want to set up a sync time to just walk through sorbus and/or rowan's use of sorbus, I'm pretty much available anytime at the moment. (Probably best for me is afternoon/evening US Eastern, though.) Just ping me on Zulip and I'll probably see it.

CAD97 commented 4 years ago

experimental r-a patch: https://github.com/CAD97/rust-analyzer/pull/2

matklad commented 4 years ago

experimental r-a patch:

What about perf-memory difference? It's probably to measure those using only parser/symbols, and not full-fledged analysis stats (thoguh global picture wouldn't hurt).

CAD97 commented 4 years ago

What about perf-memory difference?

I'm not sure how to test this (or if I even can on Windows).

I can draw diagrams for the tree layout, though, to estimate the memory usage difference. I'm only considering 64 bit targets here, for simplicity.

Current:

  Token                    TokenData
+----------------+       +--------------------+
| pointer |  0b1 |------>| ArcInner (2×usize) |
+----------------+       | SyntaxKind (u16)   |
    (arc style)          | SmolStr (3×usize)  |
                         +--------------------+

  Node                     NodeData
+----------------+       +--------------------+
| pointer & !0b1 |------>| ArcInner (2×usize) |
+----------------+       | N (usize)          |
    (arc style)          | SyntaxKind (u16)   |
                         | TextSize (u32)     |
                         | [PackedElement; N] |
                         +--------------------+

  PackedElement = union { Token, Node };

Each unique token: 48 bytes†.
Each unique node: 32 bytes, plus 8 bytes for each child edge.

† each unique token text that is not whitespace and >22 bytes requires an additional Arc<str>.

Sorbus:

  Arc<Token>             Token
+--------------+       +--------------------+
| thin pointer |------>| ArcInner (2×usize) |
+--------------+       | N (u32)            |
                       | SyntaxKind (u16)   |
                       | text: [u8; N]      |
                       +--------------------+

  Arc<Node>              Node
+--------------+       +--------------------+
| thin pointer |------>| ArcInner (2×usize) |
+--------------+       | N (u16)            |
                       | SyntaxKind (u16)   |
                       | TextSize (u32)     |
                       | [PackedChild; N]   |
                       +--------------------+

  PackedChild = union {
      (TextSize, PackedElement),
      (PackedElement, TextSize),
  };  (align u32, size 3×u32)

  PackedElement = union {
      Arc<Token> | 0b1,
      Arc<Node> & !0b1,
  };

Each unique token: 22 bytes, plus the size of the token text (rounded up to the nearest 2 mod 8).
Each unique node: 24 bytes, plus 12 bytes for each child edge† (rounded up to the nearest 0 mod 8).

† each child edge is carrying an additional 4 bytes of information over the current tree.

It's hard to say the exact expected memory usage just by looking at this, however:

A non-whitespace token with length that fits in a SmolStr will fit in the same space (or smaller).
A non-whitespace token with length that does not fit in a SmolStr will still only use a single allocation and allocate less space.
A whitespace token >26 bytes will take up more space (but still be deduplicated).
A node has a constant size decrease of 8 bytes.
Each node child edge takes 50% more space (but this is due to storing more information).

The key advantage, then, is probably the more principled design, rather than any specific space savings; I'd guess space to be about the same or slightly higher (but less higher than the binsearch branch).

And that more principled design means handing out actual &Node rather than &Arc<Node>, saving an indirection. So this probably will be felt more in performance of syntax tree APIs than actual memory usage, so I guess it does need someone who knows how to test it to test it.

Even if sorbus's approach turns out to be a bust, and the ArcBorrow usage doesn't turn out to be worth anything, I think I do want to get around to reimplementing the current implementation with pointer-utils/erasable/slice-dst rather than thin-dst, just because they provide a more elegant abstraction layer.

CAD97 commented 4 years ago

I'm going to close this.

I'll reopen again once Rust 1.46 is out and sorbus is on stable again. Alloc-free cache hits is a big deal that I saw fit to go ahead and merge into sorbus to prevent the impl there from bitrotting.

Probably when 1.46 hits beta I'll start back in on experiments. I'd like to be able to show that sorbus at the very least provides the same level of performance and memory usage patterns as current rowan, given that it's actually storing more information in the tree so that offset -> Node is a sublinear thing instead of a superlinear thing. (Current estimate: O(log² n) vs O(n log n), or more specifically O(d log w) vs O(d w) for d = (avg) node depth, w = (avg) node width.)

I'd also like to present two approaches: encapsulating sorbus and re-exporting sorbus. This PR took the approach of completely hiding sorbus from the outside world, but it would be interesting to also explore just re-exporting sorbus's types at the proper places. This PR took the encapsulated approach in part due to coherence rules, so that will again be an interesting experience, and probably inform whether sorbus stays green-only and Rowan is the syntax layer (cool!) or if sorbus needs to grow some syntax layer support of its own.

Once Rowan is using sorbus, I've got a concept for a single unified syntax tree (rather than the cursor/API split that currently exists) that I want to experiment with as well.

Anyway, that's my plans around Rowan/sorbus. And eventually I need to get back to the blog post series I was working on before getting sidetracked improving the tools it's built on...

rust-analyzer / rowan

Use sorbus internally as the green tree implementation #58