oxc-project / oxc

⚓ A collection of JavaScript tools written in Rust.
https://oxc.rs
MIT License
11.49k stars 417 forks source link

Move `TransformCtx` into `TraverseCtx` #6171

Open overlookmotel opened 1 week ago

overlookmotel commented 1 week ago

The problem

Currently we have TransformCtx which contains shared state between all transforms. It's stored in various transforms' self as a reference &'ctx TransformCtx<'a>.

Because transforms only have an immutable reference to TransformCtx, all mutable state has to be wrapped in a RefCell. e.g. the shared stack VarDeclarations introduced in #6170. RefCell is somewhat costly.

Proposed solution

Instead, move TransformCtx to live within TraverseCtx. TraverseCtx is a singleton and is passed to every visitor method in all transforms as a mutable &mut TraverseCtx<'a>. So there only needs to be one instance of TransformCtx, and we can get rid of the 'ctx lifetime too.

But we want to avoid leaking the transformer's internals into Traverse, as Traverse is meant to be a tool which can be used for other purposes too (e.g. minifier).

How?

I propose that we parameterize TraverseCtx so it can hold arbitrary state:

pub trait Traverse<'a, State> {
    fn enter_statements(&mut self, stmts: &mut Vec<'a, Statement<'a>>, ctx: &mut TraverseCtx<'a, State>) {}
    fn exit_statements(&mut self, stmts: &mut Vec<'a, Statement<'a>>, ctx: &mut TraverseCtx<'a, State>) {}
    // etc...
}

pub struct TraverseCtx<'a, State> {
    pub ancestry: TraverseAncestry<'a>,
    pub scoping: TraverseScoping,
    pub ast: AstBuilder<'a>,
    pub state: State,
}

// The new name for `TransformCtx`
struct TransformState<'a> {
    errors: Vec<OxcDiagnostic>,
    pub trivias: Trivias,
    var_declarations: SparseStack<Vec<'a, VariableDeclarator<'a>>>,
    // ... etc ...
}

type TransCtx<'a> = TraverseCtx<'a, TransformState<'a>>;

struct Common;

impl<'a> Traverse<'a, TransformState<'a>> for Common {
    fn enter_statements(&mut self, _stmts: &mut Vec<'a, Statement<'a>>, ctx: &mut TransCtx<'a>) {
        ctx.state.var_declarations.push(None);
    }

    fn exit_statements(&mut self, stmts: &mut Vec<'a, Statement<'a>>, ctx: &mut TransCtx<'a>) {
        if let Some(declarations) = ctx.state.var_declarations.pop() {
            // Insert `var` declaration statement
        }
    }
}

Preconditions

We currently store AstBuilder in both TransformCtx and TraverseCtx. We will need switch all uses of the copy in TransformCtx over to TraverseCtx i.e. self.ctx.ast.null_literal(SPAN) -> ctx.ast.null_literal(SPAN).

We can then remove AstBuilder from TransformCtx.

We will need to do this anyway once we have NodeIds on all AST nodes, and need AstBuilder to be stateful to maintain a counter for next Node ID - which requires there to be only 1 copy of AstBuilder in the transformer.

Boshen commented 1 week ago

The minifier is using TraverseCtx as well.

overlookmotel commented 1 week ago

Just to expand on what Boshen said: The problem he is pointing out is that the generic State would mean compiler has to build 2 separate copies of Traverse for the transformer and minifier (as they would have different States). All methods of Traverse and also all the walk_* functions take a TraverseCtx.

Traverse is not so big. walk.rs is 5600 lines, but it actually boils down to not that much. A lot of it is verbose Rust code that is reduced to only a few assembly operations. e.g. &mut *((node as *mut u8).add(ancestor::OFFSET_PROGRAM_HASHBANG) as *mut Option<Hashbang>) boils down to just ptr + 16. So maybe having 2 copies of it is not so bad.

But another approach would be to try to replace the RefCells in transformer with GhostCells. I'll look into whether can make that work before resorting to generics.

overlookmotel commented 2 days ago

6246 demonstrated replacing RefCells with UnsafeCells delivered no significant speed-up. So switching to GhostCell would be similarly un-impactful.

I think it's worthwhile trying the generic State approach, and seeing what impact it has on both runtime and compile time.