Open overlookmotel opened 6 months ago
Maybe we can take inspiration from @babel/template
Yes, that's exactly what got me thinking about this, but I think the rusty equivalent would be a macro. Babel parses the template at runtime, whereas I think we'd want to do it at compile time, or it'd be a big runtime slow-down.
Yes, I also think we should do it at compile-time.
Just to say, I imagine that implementing the macro would not be too difficult. But what will be really hard is to avoid it blowing up compile times. We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.
Need some kind of templating to reduce boilerplate and speed up development.
@overlookmotel and I talked a bit about this. It is still really early in the discussion and needs further R&D. The gist of our plan is what he stated here:
We ideally want to expand the macros once in a build script and cache the result, so you only pay the macro compilation cost once, not every time the crate is compiled. But this is a hard problem.
We basically want to have a macro cache, but there are many ways to do it. For caching we need 2 different macro implementations: one would get executed if the cache doesn't exist and generate a binary file, and the other would just output the cached result.
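As a rough illustration of the cold/warm split described above, here is a minimal sketch using only the standard library. The names `cache_path` and `expand_cached`, the hash-based file naming, and the idea of passing the slow expansion as a closure are all assumptions made for this example, not an agreed design.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical cache key: a hash of the template source text.
/// Note: `DefaultHasher` output is not guaranteed stable across Rust
/// releases, so a real cache would want a fixed hashing algorithm.
fn cache_path(cache_dir: &Path, template: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    template.hash(&mut hasher);
    cache_dir.join(format!("{:016x}.bin", hasher.finish()))
}

/// Warm path: return the cached bytes if the cache file exists.
/// Cold path: run `expand` (the expensive step), store its output, return it.
fn expand_cached(
    cache_dir: &Path,
    template: &str,
    expand: impl FnOnce(&str) -> Vec<u8>,
) -> io::Result<Vec<u8>> {
    let path = cache_path(cache_dir, template);
    if path.exists() {
        return fs::read(&path);
    }
    let bytes = expand(template);
    fs::create_dir_all(cache_dir)?;
    fs::write(&path, &bytes)?;
    Ok(bytes)
}
```

Calling `expand_cached` twice with the same template runs the closure only on the first call. The hard part discussed in this thread, wiring this into the build so the macro crate itself is not recompiled, is not addressed by this sketch.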
I can see a few different approaches.
It is the most natural thing to have: something like a #[pure_macro] attribute that would cache and output the result based on the input of the macro. But it is too much work, so it's not a really viable option at this point.
We just remove the intermediate step of using the builder. It parses the template and generates the appropriate builder calls based on the parsed tree.
2
or pay the cost at the runtime)
I find this one the more convenient and probably the fastest. We already have most of our AST as repr(C) and we plan to do this for all of them anyway. We just use the parser to parse the AST and cache the value. In the cold run it generates the binary, and then all we have to do is use an include macro to embed and cast the binary in place of builder calls.
In all of these, we need to keep the ast! macro out of the actual oxc_ast and oxc_parser crates.
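The "embed and cast the binary" option above can be sketched with a toy node. Everything here is an assumption for illustration: `NumericLiteral` is a stand-in struct and not an oxc type, and the helper names are made up. In the real scheme the bytes would presumably come from `include_bytes!` rather than a `Vec`. Real AST nodes also contain arena references, which a plain byte copy like this cannot handle.

```rust
/// Toy stand-in for an AST node (NOT an oxc type).
/// `repr(C)` pins the field order and layout: 4 + 4 + 8 bytes, no padding.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct NumericLiteral {
    start: u32,
    end: u32,
    value: f64,
}

/// Cold run: dump the node field by field in native byte order,
/// which matches the in-memory `repr(C)` layout of this struct.
fn to_bytes(node: &NumericLiteral) -> Vec<u8> {
    let mut out = Vec::with_capacity(std::mem::size_of::<NumericLiteral>());
    out.extend_from_slice(&node.start.to_ne_bytes());
    out.extend_from_slice(&node.end.to_ne_bytes());
    out.extend_from_slice(&node.value.to_ne_bytes());
    out
}

/// Warm run: reinterpret cached bytes as the node.
fn from_bytes(bytes: &[u8]) -> Option<NumericLiteral> {
    if bytes.len() != std::mem::size_of::<NumericLiteral>() {
        return None;
    }
    // SAFETY: the length was checked above, every bit pattern is a valid
    // u32/f64, and `read_unaligned` places no alignment requirement on `bytes`.
    Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr().cast::<NumericLiteral>()) })
}
```

A cache produced this way is only valid for the target it was generated on, since it bakes in native endianness and layout.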
Can we think outside of the box and not go down the macros rabbit hole in the first place 🙃
I'm really interested in knowing what you have in mind, because I'm not sure how we can do this without parsing it at runtime or transforming the JS into builder calls.
The only other option that I can think of is having more methods in our ast_builder, which isn't exactly the same thing as this, but it could satisfy the needs of Rolldown and other end users.
Me, @Boshen, @rzvxa and @Dunqing discussed this today on a video call.
We noted that SWC does something similar to the ast! macro mentioned above. Their version is:
let stmt = quote!{
"let $name = true;" as Stmt,
name = BindingIdentifier { span: SPAN, name: "Foo" }
};
We decided that we'd prefer to have the template insertions outside of the macro like this:
let name = BindingIdentifier { span: SPAN, name: "Foo" };
let stmt = ast_stmt!{ let #name = 123; };
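To make the "insertions outside the macro" idea concrete, here is a toy sketch using `macro_rules!`. It only pattern-matches two fixed statement shapes and does not parse JavaScript. The AST types below are simplified stand-ins invented for this example, not the real oxc types, and the real macro would presumably be a proc macro driving the parser.

```rust
// Simplified stand-in types (NOT the oxc AST).
#[derive(Debug, Clone, PartialEq)]
struct BindingIdentifier {
    name: String,
}

#[derive(Debug, PartialEq)]
enum Expression {
    NumericLiteral(f64),
}

#[derive(Debug, PartialEq)]
enum Statement {
    VariableDeclaration {
        id: BindingIdentifier,
        init: Option<Expression>,
    },
}

/// Toy `ast_stmt!`: `#name` refers to a Rust variable in scope at the call site.
macro_rules! ast_stmt {
    // `let #name = <number>;`
    (let # $name:ident = $value:literal ;) => {
        Statement::VariableDeclaration {
            id: $name.clone(),
            init: Some(Expression::NumericLiteral($value as f64)),
        }
    };
    // `let #name;`
    (let # $name:ident ;) => {
        Statement::VariableDeclaration {
            id: $name.clone(),
            init: None,
        }
    };
}
```

With a `name` variable of type `BindingIdentifier` in scope, `ast_stmt! { let #name = 123; }` expands to the `Statement::VariableDeclaration { .. }` construction directly, so no parsing happens at runtime.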
We agreed that:
ast! macro approach (compared to the other options of AST snippets, or adding loads of complicated methods to AST builder).
Statement::VariableDeclaration( /* ...etc... */ ).
I believe that's an accurate representation of our discussion. If not, please let me know!
Note1: #name is a private identifier. We'll let the implementor decide what marker to use.
Note2: oxc may end up not using the macros itself due to compile time regressions, but we'll export the macros for downstream crates to use.
macro for quick prototype, and some other form to paste in the expand code for usage.
Note1: #name is a private identifier.
Good point. Is there another "special" character we can use which is unambiguous? %name? @@name?
Maybe we can use a ruby-like interpolation:
ast!(let #{name} = 123;)
It won't conflict with any valid JS code, and it can't be mistaken for `${name}`. We can also use the same syntax with parentheses for expanding iterators.
ast!(let arr = [ #(#{elements}),* ])
Both #(ident) and #{ident} are syntax errors as of now, so maybe we can leverage them to our advantage. But this proposal (stage 1) can cause problems in the future if it gets in.
https://tc39.es/proposal-record-tuple/#sec-ecmascript-language-lexical-grammar
With that said, IMHO the odds of it getting in with this syntax are slim. TC39 nowadays is trying to be closer to TypeScript, and TS folks would prefer a C#-like syntax with Tuple/Record, so they probably prefer this:
let a = Tuple(1, 2, 3);
Over this:
let a = #[1, 2, 3];
But it's just a wild guess.
I like the Ruby-style interpolation syntax. It allows more flexibility than quote!'s interpolation, which can only accept an identifier $foo, whereas with this style we can do #{foo.bar}, #{self.foo.bar(qux)} etc.
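The extra flexibility of `#{...}` can be illustrated with another toy `macro_rules!` sketch, where the braces carry an arbitrary Rust expression instead of a bare identifier. The types and the `Scope` struct are hypothetical stand-ins, only one statement shape is matched, and the sequence-expansion form is not covered.

```rust
// Simplified stand-in types (NOT the oxc AST).
#[derive(Debug, Clone, PartialEq)]
struct BindingIdentifier {
    name: String,
}

#[derive(Debug, PartialEq)]
enum Statement {
    Let { id: BindingIdentifier, init: f64 },
}

/// Hypothetical holder, used only to show a field-access/method-call interpolation.
struct Scope {
    current: BindingIdentifier,
}

/// Toy `ast!`: whatever sits inside `#{ ... }` is evaluated as a Rust expression.
macro_rules! ast {
    (let # { $id:expr } = $value:literal ;) => {
        Statement::Let {
            id: $id,
            init: $value as f64,
        }
    };
}
```

For example, `ast!(let #{scope.current.clone()} = 123;)` works here, whereas an identifier-only marker would need a temporary variable first.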
Shall we go with #(...) just to avoid potential clash with Tuple syntax?
Sequence expansion could use #(..#(elements)),* or ##(#(elements)),* (or something else, many possibilities).
I'm not dead set on this - #{...} works for now.
The reason I chose this is its familiarity. It looks like ${expr} to people; they just have to use # instead of $.
But going with parentheses might be worth it to avoid any possible clash. It is such a minor change that we can discuss it at the last minute and still be fine.
Actually, ${expr} would not be legal JS syntax either, would it? That's an identifier followed by an object literal, which is a syntax error, I think.
It can be valid in the context of class/enum declaration:
class ${ /* stuff */ }
Ha! Yes, of course you're right.
Currently, I think we can utilize macros (macro_rules!-like) to parse the code for babel helper-like scenarios. The shared allocator can be utilized in parsing, so for the time being I do not anticipate a significant trade-off, particularly when the target code is multiple lines long, for instance polyfill-like code for CommonJS, async/await, etc.
Having thought about it more, the problem with an interpolation syntax is that we need to add support for it to the parser.
If we stick with valid (but unusual) syntax, we can avoid that problem. e.g.:
let expr = get_expression_somehow();
let stmt = ast!{ Statement: const foo = $expr; };
The remaining problem is how to deal with macro hygiene. $expr needs to be replaced in the output with expr, where expr resolves to the expr variable outside the macro. We could look at the quote! macro to see how it achieves this.
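Because `$expr` is a valid JS identifier, an unmodified parser would see it as an ordinary identifier reference, and the substitution can happen as a pass over the parsed tree. The sketch below shows only that replacement step, on a toy expression type invented for this example. It does not address hygiene; in a proc macro that part would likely come down to which span the emitted `expr` identifier carries.

```rust
use std::collections::HashMap;

// Toy expression tree (NOT the oxc AST).
#[derive(Debug, Clone, PartialEq)]
enum Expression {
    Identifier(String),
    NumericLiteral(f64),
    Binary(Box<Expression>, char, Box<Expression>),
}

/// Replaces every identifier of the form `$name` with the expression bound
/// to `name`. Identifiers with no binding are left untouched.
fn substitute(expr: Expression, bindings: &HashMap<&str, Expression>) -> Expression {
    match expr {
        Expression::Identifier(name) => {
            // Look up `name` without its `$` prefix, if it has one.
            let bound = name
                .strip_prefix('$')
                .and_then(|key| bindings.get(key))
                .cloned();
            bound.unwrap_or(Expression::Identifier(name))
        }
        Expression::Binary(left, op, right) => Expression::Binary(
            Box::new(substitute(*left, bindings)),
            op,
            Box::new(substitute(*right, bindings)),
        ),
        other => other,
    }
}
```

One consequence of this design is that a template can no longer contain a genuine JS identifier that happens to start with `$` and matches a binding name, which is the trade-off of reusing valid syntax as the marker.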
And then how to do the parsing at build-time (like oxc_ast_tools), instead of the compile-time impact of parsing in a proc macro.
Another question, how to set a correct Span?
This is the code we have to write in the transformer to generate let Foo;:
Seems a bit bananas! Very long to write, and non-trivial to understand at first glance what the code is doing.
Surely we can find some way to reduce this.
Like:
ast!{ let #name; }
The obvious problem is compile time impact of macros. Can we find some way to precompile macro expansion so we can enjoy a simpler code style without paying the compile-time price?