oxc-project / backlog

backlog for collaborators only

Separate AST types for with and without TypeScript? #100

Open overlookmotel opened 1 month ago

overlookmotel commented 1 month ago

Many AST types contain multiple fields for TS-related data e.g. Function:

pub struct Function<'a> {
    pub r#type: FunctionType,
    pub span: Span,
    pub id: Option<BindingIdentifier<'a>>,
    pub generator: bool,
    pub r#async: bool,
    // TS only
    pub declare: bool,
    // TS only
    pub type_parameters: Option<Box<'a, TSTypeParameterDeclaration<'a>>>,
    // TS only
    pub this_param: Option<TSThisParameter<'a>>,
    pub params: Box<'a, FormalParameters<'a>>,
    // TS only
    pub return_type: Option<Box<'a, TSTypeAnnotation<'a>>>,
    pub body: Option<Box<'a, FunctionBody<'a>>>,
    pub scope_id: Cell<Option<ScopeId>>,
}

Even with all of these fields boxed (8 bytes each), 24 of the struct's 104 bytes are TS-specific.

Many other AST types similarly have various TS-specific fields.

The problems with this

  1. Unnecessarily high memory use when the source code is JS (the majority of the code Oxc will be used on, since e.g. when bundling, most of the code comes from node_modules).
  2. Many more fields to visit when using Visit / VisitMut / Traverse, which makes traversal slower.

Possible solutions

1. Separate types

We could have 2 separate types Function and FunctionWithTS.

enum Expression<'a> {
    FunctionExpression(Box<'a, Function<'a>>),
    FunctionExpressionWithTS(Box<'a, FunctionWithTS<'a>>),
    // etc...
}

As the AST is now #[repr(C)] and we control type memory layouts, we could give Function and FunctionWithTS identical layouts for the JS fields, and put the TS fields at the end of FunctionWithTS. Then when stripping types in the transformer, turning a FunctionWithTS into a plain Function only requires changing the discriminant of the Expression (a 1-byte write).
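To make the discriminant trick concrete, here is a rough, heavily simplified sketch. Everything in it is an assumption for illustration: the reduced field sets, the `is_async` name, the `&'static str` stand-in for a type annotation, the explicit `#[repr(u8)]` tags, and the helper names `js_fields_line_up` and `strip_types`. Plain `&mut` references stand in for arena boxes (pointer-sized, and nothing is freed when the enum is dropped), which is what makes reinterpreting the payload tolerable here; this would not be safe with `std::boxed::Box`.

```rust
use std::mem::offset_of;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}

/// Plain JS function node (hypothetical, reduced field set).
#[repr(C)]
#[derive(Debug)]
pub struct Function {
    pub span: Span,
    pub generator: bool,
    pub is_async: bool,
}

/// Same JS fields in the same order, with the TS fields appended at the end.
#[repr(C)]
#[derive(Debug)]
pub struct FunctionWithTS {
    pub span: Span,
    pub generator: bool,
    pub is_async: bool,
    // TS only
    pub declare: bool,
    // TS only (stand-in for a real type annotation node)
    pub return_type: Option<&'static str>,
}

/// `#[repr(u8)]` puts a 1-byte tag at offset 0, followed by the payload.
/// Both payloads are a single thin pointer, so they sit at the same offset.
#[repr(u8)]
#[derive(Debug)]
pub enum Expression<'a> {
    FunctionExpression(&'a mut Function) = 0,
    FunctionExpressionWithTS(&'a mut FunctionWithTS) = 1,
}

/// Checks that every JS field sits at the same offset in both structs.
pub fn js_fields_line_up() -> bool {
    offset_of!(Function, span) == offset_of!(FunctionWithTS, span)
        && offset_of!(Function, generator) == offset_of!(FunctionWithTS, generator)
        && offset_of!(Function, is_async) == offset_of!(FunctionWithTS, is_async)
}

/// Strips TS data by rewriting only the enum's tag byte.
pub fn strip_types(expr: &mut Expression<'_>) {
    if matches!(*expr, Expression::FunctionExpressionWithTS(_)) {
        let tag = (expr as *mut Expression<'_>).cast::<u8>();
        // SAFETY: the tag of a `#[repr(u8)]` enum is its first byte. The payload
        // pointer is left untouched and points at a `FunctionWithTS`, whose leading
        // bytes are a valid `Function` (same `#[repr(C)]` prefix, alignment at least
        // as large). The TS fields are simply never looked at again.
        unsafe { tag.write(0) };
    }
}
```

After `strip_types`, matching on the expression yields the `FunctionExpression` variant and the JS fields read back unchanged. The TS fields still occupy memory in the arena, so this saves work in the transformer rather than memory for already-parsed TS files.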

2. Store TS fields in a separate struct

struct FunctionTS<'a> {
    pub declare: bool,
    pub type_parameters: Option<Box<'a, TSTypeParameterDeclaration<'a>>>,
    pub this_param: Option<TSThisParameter<'a>>,
    pub return_type: Option<Box<'a, TSTypeAnnotation<'a>>>,
}

Then the TS-related fields in Function could be replaced with just 1 field ts: Option<Box<'a, FunctionTS<'a>>> (8 bytes).
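A small sketch of what option 2 buys, using `std::boxed::Box` and made-up stand-in types (`TypeParams`, `ThisParam`, `TypeAnnotation`, `FunctionInline`, and the `sizes` helper are all hypothetical, and the field sets are cut down). It only illustrates the shape of the idea and is not how Oxc's arena `Box` would be used:

```rust
use std::mem::size_of;

// Stand-ins for the real TS node types.
pub struct TypeParams(pub Vec<String>);
pub struct ThisParam(pub String);
pub struct TypeAnnotation(pub String);

/// Current shape: TS fields stored inline in the function node.
pub struct FunctionInline {
    pub generator: bool,
    pub is_async: bool,
    pub declare: bool,
    pub type_parameters: Option<Box<TypeParams>>,
    pub this_param: Option<Box<ThisParam>>,
    pub return_type: Option<Box<TypeAnnotation>>,
}

/// Proposed shape: all TS fields grouped in one struct...
pub struct FunctionTS {
    pub declare: bool,
    pub type_parameters: Option<Box<TypeParams>>,
    pub this_param: Option<Box<ThisParam>>,
    pub return_type: Option<Box<TypeAnnotation>>,
}

/// ...and the function node keeps a single optional pointer to it.
pub struct Function {
    pub generator: bool,
    pub is_async: bool,
    pub ts: Option<Box<FunctionTS>>,
}

impl Function {
    /// The ergonomic cost: every TS access goes through one more `Option`.
    pub fn return_type(&self) -> Option<&TypeAnnotation> {
        self.ts.as_ref()?.return_type.as_deref()
    }
}

/// Returns (size of the inline layout, size of the split layout) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<FunctionInline>(), size_of::<Function>())
}
```

For a JS file, `ts` is always `None` and each function node carries one pointer-sized field instead of several. For a TS file, each function pays for an extra allocation and an extra pointer hop.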

3. Store TS fields in side array

Store TS-related fields in a side array indexed by AstNodeId.
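One possible shape for the side array, as a sketch only. The `node_id` field on `Function`, the `TsSideTable` type, and the reduced `FunctionTS` are all hypothetical, and `AstNodeId` is modelled as a plain `u32` newtype:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AstNodeId(pub u32);

/// TS-only data for a function (reduced, with a `String` stand-in for the type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct FunctionTS {
    pub declare: bool,
    pub return_type: Option<String>,
}

/// The function node itself carries no TS fields, only its ID.
#[derive(Debug)]
pub struct Function {
    pub node_id: AstNodeId,
    pub generator: bool,
    pub is_async: bool,
}

/// Side array indexed by `AstNodeId`. It stays empty for plain JS sources.
#[derive(Debug, Default)]
pub struct TsSideTable {
    entries: Vec<Option<FunctionTS>>,
}

impl TsSideTable {
    /// Stores TS data for a node, growing the array if needed.
    pub fn insert(&mut self, id: AstNodeId, data: FunctionTS) {
        let index = id.0 as usize;
        if index >= self.entries.len() {
            self.entries.resize_with(index + 1, || None);
        }
        self.entries[index] = Some(data);
    }

    /// Looks up TS data for a node, if any was recorded.
    pub fn get(&self, id: AstNodeId) -> Option<&FunctionTS> {
        self.entries.get(id.0 as usize)?.as_ref()
    }

    /// Stripping all types amounts to dropping the table's contents.
    pub fn clear(&mut self) {
        self.entries.clear();
    }
}
```

With this layout the AST nodes are the same size for JS and TS, and stripping types never touches the AST itself. The trade-off is that any code needing TS data has to thread the table through and do a lookup per node. A dense `Vec<Option<_>>` also wastes space when only a few nodes have TS data, so a sparse map might fit better.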

Last thoughts

Maybe this is not a good idea. It would be more performant, but perhaps that's outweighed by ergonomics.

Anyway, this is not important or urgent. Just writing it up now as it occurred to me.

Related to #34.