oxc-project / oxc

⚓ A collection of JavaScript tools written in Rust.
https://oxc.rs
MIT License

AST template macros #4112

Open overlookmotel opened 6 months ago

overlookmotel commented 6 months ago

This is the code we have to write in the transformer to generate let Foo;:

let kind = VariableDeclarationKind::Let;
let declarations = {
    let ident = BindingIdentifier::new(SPAN, name.clone());
    let pattern_kind = self.ctx.ast.binding_pattern_identifier(ident);
    let binding = self.ctx.ast.binding_pattern(pattern_kind, None, false);
    let decl = self.ctx.ast.variable_declarator(SPAN, kind, binding, None, false);
    self.ctx.ast.new_vec_single(decl)
};
Declaration::VariableDeclaration(self.ctx.ast.variable_declaration(
    SPAN,
    kind,
    declarations,
    Modifiers::empty(),
))

Seems a bit bananas! Very long to write, and non-trivial to understand at first glance what the code is doing.

Surely we can find some way to reduce this.

Like: ast!{ let #name; }

The obvious problem is the compile-time impact of macros. Can we find some way to precompile macro expansion so we can enjoy a simpler code style without paying the compile-time price?

Dunqing commented 6 months ago

Maybe we can take inspiration from @babel/template

overlookmotel commented 6 months ago

Yes, that's exactly what got me thinking about this, but I think the rusty equivalent would be a macro. Babel parses the template at runtime, whereas I think we'd want to do it at compile time, or it'd be a big runtime slow-down.

Dunqing commented 6 months ago

Yes, that's exactly what got me thinking about this, but I think the rusty equivalent would be a macro. Babel parses the template at runtime, whereas I think we'd want to do it at compile time, or it'd be a big runtime slow-down.

Yes, I also think we should do it at compile-time.

overlookmotel commented 6 months ago

Just to say, I imagine that implementing the macro would not be too difficult. But what will be really hard is to avoid it blowing up compile times. We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.

Boshen commented 5 months ago

Need some kind of templating to reduce boilerplate and speed up development.

rzvxa commented 5 months ago

Me and @overlookmotel talked a bit about this. It is still really early in the discussion and needs further R&D. The gist of our plan is what he stated here:

We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.

We basically want a macro cache, but there are many ways to do it. For caching we need two different macro implementations: one that runs when the cache doesn't exist and generates a binary file, and another that just outputs the cached result.
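To make the cold/warm split concrete, here is a minimal in-memory sketch. ExpansionCache, get_or_expand and the misses counter are hypothetical names for illustration, not anything in oxc, and a real version would persist entries to disk between builds rather than keep them in a HashMap.

```rust
use std::collections::HashMap;

/// Hypothetical cache keyed by the template source text.
pub struct ExpansionCache {
    entries: HashMap<String, String>,
    /// Number of times the expensive expansion actually ran (cold path).
    pub misses: usize,
}

impl ExpansionCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached expansion for `template`.
    /// `expand` is only called the first time a given template is seen.
    pub fn get_or_expand<F>(&mut self, template: &str, expand: F) -> &str
    where
        F: FnOnce(&str) -> String,
    {
        if !self.entries.contains_key(template) {
            // Cold path: pay the expansion cost once and remember the result.
            self.misses += 1;
            let output = expand(template);
            self.entries.insert(template.to_string(), output);
        }
        // Warm path: just hand back the stored output.
        &self.entries[template]
    }
}
```

The second lookup of the same template never calls the closure, which is the property we want from the "just output the result" implementation.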

I can see a few different approaches.

1. Cache the I/O

It is the most natural thing to have: something like a #[pure_macro] attribute that would cache and output the result based on the input of the macro. But it is too much work, so it's not really a viable option at this point.

2. Cache the builder instructions

We just remove the intermediate step of using the builder. The macro parses the template and generates the appropriate builder calls based on the parsed tree.

3. Cache the AST output (dynamic evaluations still need to use option 2 or pay the cost at runtime)

I find this one the most convenient and probably the fastest. We already have most of our AST as repr(C), and we plan to do this for all of it anyway. We just use the parser to parse the template into an AST and cache the value. On the cold run it generates the binary, and then all we have to do is use an include macro to embed and cast the binary in place of builder calls.

In all of these, we need to keep the ast! macro out of the actual oxc_ast and oxc_parser crates.
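As a rough illustration of option 2, the sketch below parses a toy template and emits the builder calls as Rust source text. Template, parse_template and emit_builder_calls are made up for this example, the parser only understands let #name;, and the emitted calls simply mirror the hand-written snippet at the top of this issue.

```rust
/// Toy template forms this sketch understands (hypothetical, not oxc's AST).
#[derive(Debug, PartialEq)]
pub enum Template {
    /// `let #placeholder;`
    LetDeclaration { placeholder: String },
}

/// Parses exactly `let #name;` and nothing else.
pub fn parse_template(src: &str) -> Option<Template> {
    let rest = src.trim().strip_prefix("let")?;
    let rest = rest.trim_start().strip_prefix('#')?;
    let name = rest.strip_suffix(';')?.trim();
    let valid = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    valid.then(|| Template::LetDeclaration { placeholder: name.to_string() })
}

/// Emits Rust source text containing the builder calls for a template.
/// The output is a string, so nothing here depends on oxc being available.
pub fn emit_builder_calls(template: &Template) -> String {
    match template {
        Template::LetDeclaration { placeholder } => [
            "let kind = VariableDeclarationKind::Let;".to_string(),
            format!("let ident = BindingIdentifier::new(SPAN, {placeholder}.clone());"),
            "let pattern_kind = self.ctx.ast.binding_pattern_identifier(ident);".to_string(),
            "let binding = self.ctx.ast.binding_pattern(pattern_kind, None, false);".to_string(),
            "let decl = self.ctx.ast.variable_declarator(SPAN, kind, binding, None, false);".to_string(),
            "let declarations = self.ctx.ast.new_vec_single(decl);".to_string(),
            "Declaration::VariableDeclaration(self.ctx.ast.variable_declaration(SPAN, kind, declarations, Modifiers::empty()))".to_string(),
        ]
        .join("\n"),
    }
}
```

The point is only the shape of the pipeline: template text in, builder-call source out, with the placeholder turned back into a plain Rust expression.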

Cons

Boshen commented 5 months ago

Can we think outside of the box and not go down the macros rabbit hole in the first place? 🙃

rzvxa commented 5 months ago

I'm really interested in knowing what you have in mind, because I'm not sure how we can do this without parsing it at runtime or transforming the JS into builder calls.

The only other option that I can think of is having more methods in our ast_builder, which isn't exactly the same thing as this, but it could satisfy the needs of Rolldown and other end users.

overlookmotel commented 4 months ago

Me, @Boshen, @rzvxa and @Dunqing discussed this today on a video call.

How SWC does this

We noted that SWC does something similar to the ast! macro mentioned above. Their version is:

let stmt = quote!{
    "let $name = true;" as Stmt,
    name = BindingIdentifier { span: SPAN, name: "Foo" }
};

We decided that we'd prefer to have the template insertions outside of the macro like this:

let name = BindingIdentifier { span: SPAN, name: "Foo" };
let stmt = ast_stmt!{ let #name = 123; };

Our plan

We agreed that:

I believe that's an accurate representation of our discussion. If not, please let me know!

Boshen commented 4 months ago

Note1: #name is a private identifier. We'll let the implementor decide what marker to use.

Note2: oxc may end up not using the macros itself due to compile time regressions, but we'll export the macros for downstream crates to use.

A macro for quick prototyping, and some other form to paste in the expanded code for actual usage.

overlookmotel commented 4 months ago

Note1: #name is a private identifier.

Good point. Is there another "special" character we can use which is unambiguous? %name? @@name?

rzvxa commented 4 months ago

Maybe we can use a ruby-like interpolation:

ast!(let #{name} = 123;)

It won't conflict with any valid JS code, and it can't be mistaken for `${name}`. We can also use the same syntax with parentheses for expanding iterators.

ast!(let arr = [ #(#{elements}),* ])

Both #(ident) and #{ident} are syntax errors as of now, so maybe we can leverage them to our advantage. But this proposal (stage 1) can cause problems in the future if it gets in: https://tc39.es/proposal-record-tuple/#sec-ecmascript-language-lexical-grammar

With that said, IMHO the odds of it getting in with this syntax are slim. TC39 nowadays is trying to be closer to TypeScript, and TS folks would prefer a C#-like syntax with Tuple/Record, so they would probably prefer this:

let a = Tuple(1, 2, 3);

Over this:

let a = #[1, 2, 3];

But it's just a wild guess.

overlookmotel commented 4 months ago

I like the Ruby-style interpolation syntax. It allows more flexibility than quote!'s interpolation which can only accept an identifier $foo, whereas with this style we can do #{foo.bar}, #{self.foo.bar(qux)} etc.

Shall we go with #(...) just to avoid potential clash with Tuple syntax?

Sequence expansion could use #(..#(elements)),* or ##(#(elements)),* (or something else, many possibilities).

I'm not dead set on this - #{...} works for now.
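For what it's worth, pulling #{...} interpolations out of a template before parsing is cheap. The sketch below is only an illustration: Segment and split_template are hypothetical names, and it balances braces naively, so a } inside a string literal in the Rust expression would confuse it.

```rust
/// One piece of a template: literal JS text or a Rust expression to splice in.
#[derive(Debug, PartialEq)]
pub enum Segment {
    Literal(String),
    Interpolation(String),
}

/// Splits `src` into literal text and `#{...}` interpolations.
/// Braces inside an interpolation are balanced, so `#{foo(|x| { x })}` works.
pub fn split_template(src: &str) -> Result<Vec<Segment>, String> {
    let mut segments = Vec::new();
    let mut literal = String::new();
    let mut chars = src.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '#' && chars.peek() == Some(&'{') {
            chars.next(); // consume the opening '{'
            let mut depth = 1usize;
            let mut expr = String::new();
            loop {
                match chars.next() {
                    Some('{') => {
                        depth += 1;
                        expr.push('{');
                    }
                    Some('}') => {
                        depth -= 1;
                        if depth == 0 {
                            break;
                        }
                        expr.push('}');
                    }
                    Some(ch) => expr.push(ch),
                    None => return Err("unterminated #{...}".to_string()),
                }
            }
            // Flush any literal text collected before this interpolation.
            if !literal.is_empty() {
                segments.push(Segment::Literal(std::mem::take(&mut literal)));
            }
            segments.push(Segment::Interpolation(expr.trim().to_string()));
        } else {
            literal.push(c);
        }
    }
    if !literal.is_empty() {
        segments.push(Segment::Literal(literal));
    }
    Ok(segments)
}
```

Because the interpolation body is kept as an opaque string, expressions like self.foo.bar(qux) pass through untouched, which is the extra flexibility over a bare $foo.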

rzvxa commented 4 months ago

The reason I chose this is its familiarity. It looks like ${expr} to people; they just have to use # instead of $. But going with parentheses might be worth it to avoid any possible clash. It is such a minor change that we can discuss it at the last minute and still be fine.

overlookmotel commented 4 months ago

Actually, ${expr} would not be legal JS syntax either, would it? That's an identifier followed by an object literal, which is a syntax error, I think.

rzvxa commented 4 months ago

Actually, ${expr} would not be legal JS syntax either, would it? That's an identifier followed by an object literal, which is a syntax error, I think.

It can be valid in the context of class/enum declaration:

class ${ /* stuff */ }

overlookmotel commented 4 months ago

class ${ /* stuff */ }

Ha! Yes, of course you're right.

hyf0 commented 4 months ago

https://x.com/_kermanx_/status/1817597119152439476

7086cmd commented 2 months ago

Currently, I think we can use macro_rules!-like macros to parse the code for Babel-helper-like scenarios. The shared allocator can be used in parsing, so for the time being I do not anticipate a significant trade-off, particularly when the target code spans multiple lines, for instance polyfill-like code for CommonJS, async/await, etc.

overlookmotel commented 1 month ago

Having thought about it more, the problem with an interpolation syntax is that we need to add support for it to the parser.

If we stick with valid (but unusual) syntax, we can avoid that problem. e.g.:

let expr = get_expression_somehow();
let stmt = ast!{ Statement: const foo = $expr; };

The remaining problem is how to deal with macro hygiene. $expr needs to be replaced in the output with expr, where expr resolves to the expr defined outside the macro. We could look at the quote! macro to see how it achieves this.

And then there's the question of how to do the parsing at build time (like oxc_ast_tools), instead of paying the compile-time cost of parsing in a proc macro.
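A rough sketch of the first half of that, finding which $-prefixed identifiers a template refers to, could look like the following. placeholder_names is a hypothetical helper, not an oxc API, and it scans ASCII text naively, so it would also pick up $foo inside string literals or comments. A real implementation would walk the parsed AST instead.

```rust
/// Collects the Rust variable names referenced as `$name` in a JS template,
/// in order of first appearance and without duplicates.
pub fn placeholder_names(src: &str) -> Vec<String> {
    // ASCII-only approximation of JS identifier start / continue characters.
    let is_start = |c: char| c.is_ascii_alphabetic() || c == '$' || c == '_';
    let is_continue = |c: char| c.is_ascii_alphanumeric() || c == '$' || c == '_';

    let mut names: Vec<String> = Vec::new();
    let mut chars = src.chars().peekable();

    while let Some(c) = chars.next() {
        if !is_start(c) {
            continue;
        }
        // Consume the whole identifier so `foo$bar` is not treated as `$bar`.
        let mut ident = String::from(c);
        while let Some(&next) = chars.peek() {
            if !is_continue(next) {
                break;
            }
            ident.push(next);
            chars.next();
        }
        // Only identifiers that start with `$` and have a non-empty tail count.
        if let Some(name) = ident.strip_prefix('$') {
            if !name.is_empty() && !names.iter().any(|n| n.as_str() == name) {
                names.push(name.to_string());
            }
        }
    }
    names
}
```

Since $expr is already a valid JS identifier, the unmodified parser accepts the template, and only the output step needs to know these names map to Rust variables at the call site.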

Dunqing commented 1 month ago

Another question: how do we set a correct Span?