AST template macros - GitHub Issues

overlookmotel commented 3 months ago

This is the code we have to write in the transformer to generate `let Foo;`:

// Declare with `let`.
let kind = VariableDeclarationKind::Let;
let declarations = {
    // Build the binding identifier and wrap it in a binding pattern.
    let ident = BindingIdentifier::new(SPAN, name.clone());
    let pattern_kind = self.ctx.ast.binding_pattern_identifier(ident);
    let binding = self.ctx.ast.binding_pattern(pattern_kind, None, false);
    // A single declarator with no initializer.
    let decl = self.ctx.ast.variable_declarator(SPAN, kind, binding, None, false);
    self.ctx.ast.new_vec_single(decl)
};
// Wrap the declarators in the declaration node.
Declaration::VariableDeclaration(self.ctx.ast.variable_declaration(
    SPAN,
    kind,
    declarations,
    Modifiers::empty(),
))

Seems a bit bananas! Very long to write, and non-trivial to understand at first glance what the code is doing.

Surely we can find some way to reduce this.

Like: ast!{ let #name; }

The obvious problem is compile time impact of macros. Can we find some way to precompile macro expansion so we can enjoy a simpler code style without paying the compile-time price?

Dunqing commented 3 months ago

Maybe we can take inspiration from @babel/template

overlookmotel commented 3 months ago

Yes, that's exactly what got me thinking about this, but I think the rusty equivalent would be a macro. Babel parses the template at runtime, whereas I think we'd want to do it at compile time, or it'd be a big runtime slow-down.

Dunqing commented 3 months ago

> Yes, that's exactly what got me thinking about this, but I think the rusty equivalent would be a macro. Babel parses the template at runtime, whereas I think we'd want to do it at compile time, or it'd be a big runtime slow-down.

Yes, I also think we should do it at compile-time.

overlookmotel commented 3 months ago

Just to say, I imagine that implementing the macro would not be too difficult. But what will be really hard is to avoid it blowing up compile times. We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.

Boshen commented 2 months ago

Need some kind of templating to reduce boilerplate and speed up development.

Requested by the Rolldown team.

rzvxa commented 2 months ago

@overlookmotel and I talked a bit about this. It is still really early in the discussion and needs further R&D. The gist of our plan is what he stated here:

> We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.

We basically want to have a macro cache, but there are many ways to do it. For caching we need two different macro implementations: one gets executed if the cache doesn't exist and generates a binary file, and the other just outputs the cached result.
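The cold/warm split described above can be sketched with an in-memory map. This is only an illustration: `MacroCache`, `expand_with` and `hash_template` are hypothetical names, the real cache would live on disk, and the "expansion" here is just a string.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory stand-in for an on-disk macro cache.
pub struct MacroCache {
    entries: HashMap<u64, String>,
    /// How many times the expensive (cold) path ran; for demonstration only.
    pub cold_runs: usize,
}

impl MacroCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), cold_runs: 0 }
    }

    /// Returns the expansion for `template`, calling `expand` only on a cache miss.
    pub fn expand_with<F>(&mut self, template: &str, expand: F) -> String
    where
        F: FnOnce(&str) -> String,
    {
        let key = hash_template(template);
        // Warm path: just output the stored result.
        if let Some(cached) = self.entries.get(&key) {
            return cached.clone();
        }
        // Cold path: run the real expansion and store it.
        self.cold_runs += 1;
        let expanded = expand(template);
        self.entries.insert(key, expanded.clone());
        expanded
    }
}

/// Hashes the template source. `DefaultHasher` is not guaranteed to be stable
/// across Rust releases, so a persistent cache would need a fixed hash function.
fn hash_template(template: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    template.hash(&mut hasher);
    hasher.finish()
}
```

Calling `expand_with` twice with the same template runs the closure once; the second call takes the warm path.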

I can see a few different approaches.

1. Cache the I/O

This is the most natural thing to have: something like a #[pure_macro] attribute that caches and outputs the result based on the macro's input. But it is too much work, so it's not really a viable option at this point.

2. Cache the builder instructions

We just remove the intermediate step of using the builder by hand: the macro parses the template and generates the appropriate builder calls based on the parsed tree.
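A rough sketch of that idea, assuming a toy template grammar that only accepts `let #<ident>;`. The function names are hypothetical, and the emitted text simply mirrors the builder snippet at the top of this thread, so it may not match the current builder API.

```rust
/// Recognizes only the toy template `let #<ident>;` and returns the
/// placeholder name. A real implementation would use the actual JS parser.
pub fn parse_let_template(src: &str) -> Option<&str> {
    let rest = src.trim().strip_prefix("let")?.trim_start();
    let rest = rest.strip_prefix('#')?;
    let name = rest.strip_suffix(';')?.trim_end();
    let is_ident = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
        && !name.starts_with(|c: char| c.is_ascii_digit());
    if is_ident {
        Some(name)
    } else {
        None
    }
}

/// Emits Rust source text containing the builder calls for `let #<placeholder>;`.
/// The output is what would be cached instead of re-running the macro.
pub fn emit_builder_calls(placeholder: &str) -> String {
    let lines = [
        "let kind = VariableDeclarationKind::Let;".to_string(),
        format!("let ident = BindingIdentifier::new(SPAN, {}.clone());", placeholder),
        "let pattern_kind = self.ctx.ast.binding_pattern_identifier(ident);".to_string(),
        "let binding = self.ctx.ast.binding_pattern(pattern_kind, None, false);".to_string(),
        "let decl = self.ctx.ast.variable_declarator(SPAN, kind, binding, None, false);".to_string(),
        "let declarations = self.ctx.ast.new_vec_single(decl);".to_string(),
    ];
    lines.join("\n")
}
```

Templates outside the toy grammar return `None` from `parse_let_template`, which is where a real macro would report a compile error.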

3. Cache the AST output (dynamic evaluations still need to use option `2` or pay the cost at the runtime)

I find this one the most convenient and probably the fastest. We already have most of our AST as repr(C), and we plan to do this for all of it anyway. We just use the parser to parse the template and cache the resulting AST value: in the cold run it generates the binary, and then all we have to do is use an include macro to embed the binary and cast it in place of the builder calls.
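To illustrate the "embed and cast" step, here is a minimal sketch using a single toy node. `NumericLiteral` here is not oxc's real type, and a byte array stands in for what `include_bytes!` would embed. Real AST nodes contain pointers into an arena, which this sketch deliberately ignores, and the bytes are in native endianness, so such a cache would be target-specific.

```rust
use std::mem::size_of;

/// Toy node with a fixed `repr(C)` layout: two `u32`s then an `f64`,
/// so there are no padding bytes.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NumericLiteral {
    pub span_start: u32,
    pub span_end: u32,
    pub value: f64,
}

pub const NODE_SIZE: usize = size_of::<NumericLiteral>();

/// Cold run: turn the parsed node into the bytes that would be written to the cache.
pub fn to_bytes(node: NumericLiteral) -> [u8; NODE_SIZE] {
    // SAFETY: source and target have the same size, and the struct has no
    // padding, so every byte is initialized.
    unsafe { std::mem::transmute::<NumericLiteral, [u8; NODE_SIZE]>(node) }
}

/// Warm run: cast the embedded bytes back into a node.
pub fn from_bytes(bytes: &[u8; NODE_SIZE]) -> NumericLiteral {
    // SAFETY: every bit pattern is a valid `u32` / `f64`, and
    // `read_unaligned` does not require the byte array to be 8-byte aligned.
    unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const NumericLiteral) }
}
```

Round-tripping a node through `to_bytes` and `from_bytes` yields an equal value.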

In all of these, we need to keep the ast! macro out of the actual oxc_ast and oxc_parser crates.

Cons

- We need more infrastructure to invalidate our cache repository.
- More maintenance cost on our part.
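One possible shape for the invalidation part, as a sketch only: derive the cache key from both the template source and a version string for the AST definitions, so that changing either one misses the cache. `cache_key` and `ast_schema_version` are hypothetical names. FNV-1a is used because its output is fixed, unlike the standard library's default hasher.

```rust
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Feeds `bytes` into a running 64-bit FNV-1a hash.
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Key for a cached expansion. A cached entry is only reused when both the
/// template text and the (hypothetical) AST schema version are unchanged.
pub fn cache_key(template: &str, ast_schema_version: &str) -> u64 {
    let hash = fnv1a(FNV_OFFSET, ast_schema_version.as_bytes());
    // Separator byte so that ("ab", "c") and ("a", "bc") hash differently.
    let hash = fnv1a(hash, &[0]);
    fnv1a(hash, template.as_bytes())
}
```

Stale entries are then never looked up again after the AST changes, though something would still have to clean them out of the cache.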

Boshen commented 2 months ago

Can we think outside the box and not go down the macros rabbit hole in the first place 🙃

rzvxa commented 2 months ago

I'm really interested in knowing what you have in mind, because I'm not sure how we can do this without parsing it at runtime or transforming the JS into builder calls.

The only other option that I can think of is having more methods in our ast_builder. That isn't exactly the same thing as this, but it could satisfy the needs of Rolldown and other end users.
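The "more methods" alternative could look roughly like this. The types below are simplified toy versions, not oxc's real AST or builder, and `let_declaration` is a hypothetical helper name.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VariableDeclarationKind {
    Let,
    Const,
}

#[derive(Debug, PartialEq)]
pub struct VariableDeclarator {
    pub kind: VariableDeclarationKind,
    pub name: String,
    pub init: Option<f64>,
}

#[derive(Debug, PartialEq)]
pub struct VariableDeclaration {
    pub kind: VariableDeclarationKind,
    pub declarations: Vec<VariableDeclarator>,
}

/// Toy builder with two low-level methods and one convenience method.
pub struct AstBuilder;

impl AstBuilder {
    pub fn variable_declarator(
        &self,
        kind: VariableDeclarationKind,
        name: &str,
        init: Option<f64>,
    ) -> VariableDeclarator {
        VariableDeclarator { kind, name: name.to_string(), init }
    }

    pub fn variable_declaration(
        &self,
        kind: VariableDeclarationKind,
        declarations: Vec<VariableDeclarator>,
    ) -> VariableDeclaration {
        VariableDeclaration { kind, declarations }
    }

    /// Convenience method: builds `let <name>;` in a single call.
    pub fn let_declaration(&self, name: &str) -> VariableDeclaration {
        let kind = VariableDeclarationKind::Let;
        let decl = self.variable_declarator(kind, name, None);
        self.variable_declaration(kind, vec![decl])
    }
}
```

The trade-off is that every common shape needs its own method, whereas a template macro covers arbitrary snippets.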

overlookmotel commented 2 months ago

Me, @Boshen, @rzvxa and @Dunqing discussed this today on a video call.

How SWC does this

We noted that SWC does something similar to the ast! macro mentioned above. Their version is:

let stmt = quote!{
    "let $name = true;" as Stmt,
    name = BindingIdentifier { span: SPAN, name: "Foo" }
};

We decided that we'd prefer to have the template insertions outside of the macro like this:

let name = BindingIdentifier { span: SPAN, name: "Foo" };
let stmt = ast_stmt!{ let #name = 123; };

Our plan

We agreed that:

- Ideal for the user is the ast! macro approach (compared to the other options of AST snippets, or adding loads of complicated methods to AST builder).
- We should try to do that.
- Parsing the JS code fragment inside the macro will happen at compile time.
- Macro will expand to Statement::VariableDeclaration( /* ...etc... */ ).
- Initial implementation will hurt compile times due to the macros.
- In a 2nd phase, we will improve compile time performance using some form of caching (as per rzvxa's comment above).
- rzvxa is going to take the lead on this.
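As a toy illustration of the expansion described in the plan above, here is a `macro_rules!` sketch over a made-up AST. It only pattern-matches two fixed token shapes, whereas the planned macro would run the real JS parser at compile time (most likely as a procedural macro). All types here are stand-ins, not oxc's.

```rust
// Toy AST types for illustration only.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Number(f64),
}

#[derive(Debug, PartialEq)]
pub struct BindingIdentifier {
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub enum Statement {
    VariableDeclaration { id: BindingIdentifier, init: Option<Expr> },
}

/// Expands `let #name;` or `let #name = <number>;` into a `Statement` value,
/// where `name` is a Rust variable in scope at the call site.
macro_rules! ast_stmt {
    (let # $name:ident ;) => {
        Statement::VariableDeclaration { id: $name, init: None }
    };
    (let # $name:ident = $value:literal ;) => {
        Statement::VariableDeclaration {
            id: $name,
            init: Some(Expr::Number($value as f64)),
        }
    };
}

/// Usage: the interpolated value is created outside the macro, as agreed above.
pub fn example() -> Statement {
    let name = BindingIdentifier { name: "Foo".to_string() };
    ast_stmt!(let #name = 123;)
}
```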

I believe that's an accurate representation of our discussion. If not, please let me know!

Boshen commented 2 months ago

Note1: #name is a private identifier. We'll let the implementor decide what marker to use.

Note2: oxc may end up not using the macros itself due to compile time regressions, but we'll export the macros for downstream crates to use.

The macro is for quick prototyping, plus some other way to paste in the expanded code for actual usage.

overlookmotel commented 2 months ago

> Note1: #name is a private identifier.

Good point. Is there another "special" character we can use which is unambiguous? %name? @@name?

rzvxa commented 2 months ago

Maybe we can use a ruby-like interpolation:

ast!(let #{name} = 123;)

It won't conflict with any valid JS code, and it can't be mistaken for `${name}`. We can also use the same syntax with parentheses for expanding iterators.

ast!(let arr = [ #(#{elements}),* ])

Both #(ident) and #{ident} are syntax errors as of now, so maybe we can use them to our advantage. But this proposal (stage 1) could cause problems in the future if it gets in: https://tc39.es/proposal-record-tuple/#sec-ecmascript-language-lexical-grammar

With that said, IMHO the odds of it getting in with this syntax are slim. TC39 nowadays is trying to stay closer to TypeScript, and the TS folks would prefer a C#-like syntax with Tuple/Record, so they would probably prefer this:

let a = Tuple(1, 2, 3);

Over this:

let a = #[1, 2, 3];

But it's just a wild guess.
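For reference, locating `#{...}` interpolations in a template string is straightforward. The sketch below (with a hypothetical function name) returns the raw text inside each placeholder and tracks brace depth so that nested braces inside the expression do not end it early. It does not handle braces inside string literals.

```rust
/// Returns the source text inside each `#{...}` placeholder, in order.
/// Scanning stops at an unterminated placeholder.
pub fn find_placeholders(template: &str) -> Vec<&str> {
    let bytes = template.as_bytes();
    let mut found = Vec::new();
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == b'#' && bytes[i + 1] == b'{' {
            let start = i + 2;
            let mut depth = 1usize;
            let mut j = start;
            // Walk forward until the matching closing brace.
            while j < bytes.len() && depth > 0 {
                match bytes[j] {
                    b'{' => depth += 1,
                    b'}' => depth -= 1,
                    _ => {}
                }
                j += 1;
            }
            if depth != 0 {
                break; // unterminated `#{`
            }
            // `j` is one past the closing brace.
            found.push(&template[start..j - 1]);
            i = j;
            continue;
        }
        i += 1;
    }
    found
}
```

A template-literal style `${name}` is not matched, since only `#{` starts a placeholder.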

overlookmotel commented 1 month ago

I like the Ruby-style interpolation syntax. It allows more flexibility than quote!'s interpolation which can only accept an identifier $foo, whereas with this style we can do #{foo.bar}, #{self.foo.bar(qux)} etc.

Shall we go with #(...) just to avoid potential clash with Tuple syntax?

Sequence expansion could use #(..#(elements)),* or ##(#(elements)),* (or something else, many possibilities).

I'm not dead set on this - #{...} works for now.

rzvxa commented 1 month ago

I chose this because of its familiarity: it looks like ${expr}, people just have to use # instead of $. But going with parentheses might be worth it to avoid any possible clash. It is such a minor change that we can discuss it at the last minute and still be fine.

overlookmotel commented 1 month ago

Actually, ${expr} would not be legal JS syntax either, would it? That's an identifier followed by an object literal, which is a syntax error, I think.

rzvxa commented 1 month ago

> Actually, ${expr} would not be legal JS syntax either, would it? That's an identifier followed by an object literal, which is a syntax error, I think.

It can be valid in the context of class/enum declaration:

class ${ /* stuff */ }

overlookmotel commented 1 month ago

> class ${ /* stuff */ }

Ha! Yes, of course you're right.

hyf0 commented 1 month ago

https://x.com/_kermanx_/status/1817597119152439476

oxc-project / oxc

AST template macros #4112
