rust-analyzer / rowan

Apache License 2.0
697 stars 57 forks source link

Pack GreenElement into 1×usize PackedGreenElement #43

Closed CAD97 closed 4 years ago

CAD97 commented 4 years ago

r? @matklad

It is required here to tag all GreenToken to have a low bit of 1 so that we can go from &PackedGreenElement to GreenElementRef.

matklad commented 4 years ago

It is required here to mangle all GreenToken to have a low bit of 1 so that we can go from &PackedGreenElement to GreenElementRef.

Oh, I see... Hm, was I wrong insisting on that we should hide Arc behind a wrapper? Naively, one would expect that iteration just yields pointers to nodes, without double indirection.... But I guess that just doesn't work out with Rust type system. Like, if iteration yields an &NodeData, you should be able to "clone" it into an Arc<NodeData> (using shared from this trick), but that doesn't work out type-wise, because clone can only remove &.

I think the current approach is the best one to start with, but maybe we should think about making it more straightforward on the machine level (no double pointers), without forking the types on the Rust level. What bites us is that Clone works for &T -> T and T -> T, but we want to &T -> U and U -> U, which is a ToOwned, Clone pair, which is more awkward to use

matklad commented 4 years ago

Hm, OTOH, these are green nodes, so perhaps we don't have to pay too much attention to the Rust-level API

CAD97 commented 4 years ago

@matklad yeah, that's the foot: either you expose String/str separately, or everyone has to deal in &String. Note that cursor::NodeData doesn't have to point at GreenNode(Arc<Inner>), though; it can point directly to the inner data. So the indirection only exists when manipulating green nodes directly.

After we get this encapsulated version pushed through, I've an itch to scratch basically rebuilding rowan in an space-optimized-from-first-principles manner, so it'll be interesting to compare that experiment once it's done.

matklad commented 4 years ago

After we get this encapsulated version pushed through, I've an itch to scratch basically rebuilding rowan in an space-optimized-from-first-principles manner, so it'll be interesting to compare that experiment once it's done.

Yeah, I feel like, with thread-local SyntaxNode, and pointer-wide interned green elements, rowan is mostly done, design wise, so it makes sense to polish this approach, and maybe publish 1.0. The only problem is that I've though exactly that before :-)

One potential improvement we still can make is adding ability to parse things lazily. Basically, for things like functions, we can store bodies literally as String, and parse them into propper green nodes on the first access.

matklad commented 4 years ago

bors r+

On Thursday, 21 November 2019, Christopher Durham notifications@github.com wrote:

@CAD97 commented on this pull request.

In src/green/token.rs https://github.com/rust-analyzer/rowan/pull/43#discussion_r349109569:

}

+unsafe impl Send for GreenToken {} // where GreenTokenData: Send + Sync +unsafe impl Sync for GreenToken {} // where GreenTokenData: Send + Sync

No, GreenToken is public and GreenTokenData is private.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-analyzer/rowan/pull/43?email_source=notifications&email_token=AANB3M5YZZT62ENDNBXOJL3QU2KGXA5CNFSM4JPKRNW2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCMQMNKQ#discussion_r349109569, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANB3M23RWTZOYAOBP5SBSLQU2KGXANCNFSM4JPKRNWQ .

bors[bot] commented 4 years ago

Build succeeded

CAD97 commented 4 years ago

Dropping this here since I don't really have a better place to put it, but I had an idea for

One potential improvement we still can make is adding ability to parse things lazily. Basically, for things like functions, we can store bodies literally as String, and parse them into propper green nodes on the first access.

Adding another "type" of node to the tree is problematic because everything currently assumes two types: leaf (token) or non-leaf (node). So.... just make a delayed blob a token!

There are a couple ways to handle this:

I'm going to try out the third approach some in my experimental rewrite.

Note that this would, of course, requiring rewriting the green tree like any other mutation. arc-swap may be able to make that more tolerable?