rust-analyzer / rowan

Apache License 2.0
689 stars 57 forks source link

Expose sorbus as green tree impl #69

Closed CAD97 closed 3 years ago

CAD97 commented 4 years ago

Because I can't help myself, apparently. Currently requires nightly and a local version of sorbus.

This is a version of the sorbus patch that just directly exposes sorbus as the green tree implementation, rather than trying to encapsulate it in #58. That explains the much larger line diff.

Again, the main difference is the change from using GreenNode/&GreenNode to using Arc<GreenNode>/ArcBorrow<'_, GreenNode>.

In order to support the strict separation required by the orphan rules, SyntaxElement is now its own enum rather than a typedef around NodeOrToken. This can, of course, be avoided if we just inline sorbus rather than using it as a separate crate (but I find the separation useful, tbh).

I think the "expose sorbus" approach is better than the "encapsulate sorbus" approach taken in #58, because the encapsulation is a lot of requisite unsafe type casting for little specific gain.

Does not add a SyntaxPtr type yet like the one provided in #58, but it can be implemented equivalently.

CAD97 commented 4 years ago
  • SyntaxPtr

This is one of the primary benefits brought by using sorbus. Sorbus stores the offset of each child from its parent (in the parent's children array), so finding the child covering an offset is a binary search O(log w) rather than O(w). This means going from an offset to the node at that offset can be O(d log w) rather than O(d w) (depth d, node width w).

If you wanted to phrase these bounds in just n, then it'd probably (probabilistically) be current O(log n × log n) and sorbus O(log n × log log n). Here's a WolframAlpha expression to compare the behavior of these functions: Plot[{Log[x], Log[x] Log[x], Log[x] Log[Log[x]]}, {x, 1, 1000}]

  • perf

Needs measuring. Notable theoretical improvements that sorbus provides include:

lazy parsing

I have a plan for that, using arc-swap. I just need to make sure that the design won't end up breaking all of the places we give out pointers with "interesting" ownership semantics. If we do want/need to add a third node type as a "thunk", sorbus is set up better to accommodate it than rowan's current design.

sorbus/ser

I even have a sorbus/deserialization feature with proper deduplication of nodes and full support of compact binary formats as well! It was fun learning how to use serde's DeserializeSeed interface.


I think at this point it's probably about time to move cad97/sorbus to rust-analyzer/sorbus. (I've invited you to cad97/sorbus so you can do as much.) If you want to set up a sync time to go over the design and internals of sorbus, please do ping me on Zulip.

CAD97 commented 3 years ago

Closing as (heavily) outdated. I'm still somewhat interested in exploring the design space some more, but it's not important.