oxc-project / oxc

⚓ A collection of JavaScript tools written in Rust.
https://oxc.rs
MIT License
12.07k stars 436 forks source link

ast: get arena-lifetime references to AST children #6873

Open DonIsaac opened 1 day ago

DonIsaac commented 1 day ago

Problem

Box<'a, T> and some AST nodes provide ways of references to their contents with the same lifetime as &self (e.g.Box<'a, T> -> &T), but do not provide a way to borrow for the lifetime of an allocation (e.g.Box<'a, T> -> &'a T). This introduces problems when collecting references to nodes in visitors.

This is mostly caused by lifetime restrictions introduced with AsRef<T>, whose contract returns a reference to T for the lifetime of &self.

Proposed Solution

Provide an AsArenaRef<T> trait for read-only borrows on arena-allocated data for the lifetime of the arena. To avoid borrow-checker violations, AsArenaRef would also need to borrow &self for the same lifetime.

pub trait AsArenaRef<'a, T: ?Sized> {
  fn as_arena_ref(&'a self) -> &'a T;
} 

Other Considerations

  1. AsArenaRef could be used to auto-implement GetAddress, but I haven't found a way to get E0207 (the type parameter U is not constrained by the impl trait, self type, or predicates)
    impl<'alloc, T, U> GetAddress for T where T: AsArenaRef<'alloc, U> {
    fn address(&self) -> crate::Address {
        Address(std::ptr::addr_of!(*self.as_arena_ref()))
    }
    }
  2. AsArenaRef could be used to auto-implement AsRef
overlookmotel commented 1 day ago

I can't quite get my head around this, but I don't think any new API should be required for the example you give.

In that case, you don't need to borrow nodes for the lifetime of the arena, only for a much shorter lifetime - the life of the spreads Vec which goes out of scope at the end of run. I can't see why that's a problem, as the original &AstNode<'a> also lives for the duration of run.

If we can avoid introducing this trait, I think it'd be ideal as it's potentially a foot-gun. It encourages borrowing nodes for longer than you need to, and "locking" the AST in a read-only state, which will cause problems if you later want to (quite legitimately) mutate the AST.

There is a similar pattern used in Visit which has always struck me as a bit fishy:

https://github.com/oxc-project/oxc/blob/b075982eaa6d11838a676d9129703f4d1070d7e8/crates/oxc_ast/src/generated/visit.rs#L42-L47

I don't actually think we should need an &'a T there either - a shorter lifetime should suffice. And in that case the extended lifetime can probably be used to break the aliasing rules (UB).

Hmm... I'd suggest as a first step, try playing with the lifetimes in #6751. It should be possible to make this work without unsafe code, and without borrowing for 'a (though I'll admit it's a bit of a brain-bender).

overlookmotel commented 1 day ago

I think the root problem in both cases may be AstKind. It should have 2 lifetimes:

pub enum AstKind<'a, 'b> {
    BooleanLiteral(&'b BooleanLiteral),
    NullLiteral(&'b NullLiteral),
    NumericLiteral(&'b NumericLiteral<'a>),
    /* ... */
}

'a is the lifetime of the AST. 'b is the lifetime of the borrowing of the AST. They shouldn't be conflated the way they are at present:

pub enum AstKind<'a> {
    BooleanLiteral(&'a BooleanLiteral),
    NullLiteral(&'a NullLiteral),
    NumericLiteral(&'a NumericLiteral<'a>),
    /* ... */
}