Open HerringtonDarkholme opened 3 years ago
It seems the Deno authors don't like swc's interface anyway. https://github.com/denoland/deno_ast
I'm thinking about a more aggressive approach, like gogocode or a more advanced semgrep. Basically, I will take the source code and feed it to tree-sitter. Tree-sitter will give me an untyped syntax tree, against which I can match a target syntax via the type id of each syntax node.
This approach is quite versatile and generic. But it might not be as fast as specialized syntax crates like rslint/swc.
The POC code resides in https://github.com/HerringtonDarkholme/vue-compiler/tree/main/crates/ref_transform
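The matching idea above can be sketched with the standard library alone. This is a rough illustration, not the POC's code and not tree-sitter's API: the `Node` and `Pattern` types, the kind ids (`CALL`, `IDENT`, `NUMBER`) and the `find_all` helper are all hypothetical names, and a real grammar would assign its own ids.

```rust
use std::collections::HashMap;

/// Untyped syntax node, loosely modelled on what a generic parser yields:
/// a numeric kind id, the covered source text, and child nodes.
/// (Hypothetical type, not tree-sitter's `Node`.)
#[derive(Debug, Clone, PartialEq)]
struct Node {
    kind_id: u16,
    text: String,
    children: Vec<Node>,
}

/// A pattern is either a concrete kind to match or a metavariable
/// that captures any subtree under a name.
enum Pattern {
    Kind { kind_id: u16, text: Option<&'static str>, children: Vec<Pattern> },
    MetaVar(&'static str),
}

// Assumed kind ids, for illustration only.
const CALL: u16 = 1;
const IDENT: u16 = 2;
const NUMBER: u16 = 3;

/// Convenience constructor for a node without children.
fn leaf(kind_id: u16, text: &str) -> Node {
    Node { kind_id, text: text.to_string(), children: vec![] }
}

/// Structural match of one pattern against one node, recording captures in `env`.
fn match_node<'a>(
    pat: &Pattern,
    node: &'a Node,
    env: &mut HashMap<&'static str, &'a Node>,
) -> bool {
    match pat {
        Pattern::MetaVar(name) => {
            env.insert(*name, node);
            true
        }
        Pattern::Kind { kind_id, text, children } => {
            *kind_id == node.kind_id
                && text.map_or(true, |t| t == node.text)
                && children.len() == node.children.len()
                && children
                    .iter()
                    .zip(&node.children)
                    .all(|(p, n)| match_node(p, n, env))
        }
    }
}

/// Pre-order walk over the tree, returning the captures of every matching node.
fn find_all<'a>(pat: &Pattern, root: &'a Node) -> Vec<HashMap<&'static str, &'a Node>> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        let mut env = HashMap::new();
        if match_node(pat, node, &mut env) {
            out.push(env);
        }
        stack.extend(node.children.iter().rev());
    }
    out
}
```

Because the matcher only compares kind ids and text, the same code works for any grammar, which is where the versatility comes from. The price is that nothing is checked by the type system, unlike a typed AST from rslint or swc.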
Reference:
- Variable Scope: https://github.com/brendanzab/moniker
- Rust Tree Matcher: https://github.com/flatt-security/shisho
- JS Tree Matcher: https://github.com/thx/gogocode
- OCaml Tree Matcher: https://semgrep.dev/
@HerringtonDarkholme
I tried using both RSLint and SWC for the analysis and transformation parts. While doing analysis is somewhat easier with RSLint, I found its `Indel` API very quirky to use.
As for SWC, I actually implemented the analysis using it, and it is surprisingly a breeze. Just make sure you do not use `Visitor` APIs, as they make things way too complicated.
I mainly use the `swc_core` crate, which makes the binary size really manageable. The numbers are (for `--release`):
- `swc_ecma_codegen`, `swc_ecma_parser`, `swc_core` and `lazy_static`. You can check the Cargo.toml;
- `lightningcss` included (Cargo.toml).

As for performance, I did some micro-benchmarks using Criterion and found SWC to reliably perform 30%-50% better than RSLint. Note that SWC was also doing some transformations in these runs, while RSLint was only parsing and not applying any transforms.
Here you can find the test suite of how analysis performs: https://github.com/phoenix-ru/rust-vue-compiler/blob/1d02ab364e222358ec9fdd713daf9afd7f92848d/crates/fervid_script/src/script_legacy/mod.rs#L110-L736
But compiling swc is really slow. It slows down my development iteration speed a lot. Back then, rust-analyzer could not handle swc's macros either.
Further, I personally don't like deeply nested pattern matching and dealing with the SWC-related `JsWord` stuff. See https://github.com/phoenix-ru/rust-vue-compiler/blob/1d02ab364e222358ec9fdd713daf9afd7f92848d/crates/fervid_script/src/script_legacy/utils.rs#L104-L149 compared to https://github.com/HerringtonDarkholme/vue-compiler/blob/main/crates/sfc/src/script/vanilla_script.rs#L125-L158
Yes, SWC is pretty slow to compile. I kind of overlooked that because of my hardware (it takes 2.5 seconds in dev mode for me).
My main idea behind SWC is that I intend to rewrite the code generation part of the compiler by composing SWC's blocks. Here's a verbose but working example (this is a rough local draft I committed): https://github.com/phoenix-ru/rust-vue-compiler/blob/06f7e4abc66d5f111c9c515136a37e22776102e7/crates/fervid_script/src/experimental_compile.rs.
I quite literally hacked this together in a lunch break and it already produces pretty solid code for me (just look at `"onClick"`, this would have been a bug in a manual implementation). With some tweaks, I believe it is a pretty robust tool to convert IR -> JS.
```js
_openBlock(_component_abc_def, {
  "modelValue": "_ctx.modelValue",
  "onUpdate:modelValue": "$event => ((_ctx.modelValue) = $event)",
  "modelModifiers": "{lazy: true}",
  "another-model-value": "_ctx.modelValue",
  "onUpdate:anotherModelValue": "$event => ((_ctx.modelValue) = $event)",
  "another-model-valueModifiers": "{trim: true}",
  "test-bound": "_ctx.bar+_ctx.baz",
  "disabled": "disabled",
  "onClick": '_withModifiers(() => {}, ["prevent"])',
  "onHello": "_ctx.world",
  "class": ""
})
```
I really like the fact that this AST can be manipulated at any stage; this way I don't need to worry about calls like `_withDirectives()`, where the node has to be compiled after processing the directive. And because `<template>` is compiled as an AST, inlining it simply means adding it to `ReturnStmt`.
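To illustrate why late manipulation is convenient, here is a minimal sketch that builds output as a tree and only stringifies at the end. It deliberately does not use swc: the `Expr` enum, `emit`, `emit_str` and `with_directives` are hypothetical stand-ins, far simpler than swc's real AST and codegen.

```rust
/// Tiny stand-in for a JS expression AST (hypothetical, not swc's types).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Ident(String),
    Str(String),
    Call { callee: String, args: Vec<Expr> },
    Object(Vec<(String, Expr)>),
}

/// Emits a double-quoted string literal, escaping what would break it.
/// Doing this in one place is what avoids quoting bugs in hand-built strings.
fn emit_str(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            _ => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Turns the tree into JS source text.
fn emit(e: &Expr) -> String {
    match e {
        Expr::Ident(name) => name.clone(),
        Expr::Str(s) => emit_str(s),
        Expr::Call { callee, args } => {
            let args: Vec<String> = args.iter().map(emit).collect();
            format!("{}({})", callee, args.join(", "))
        }
        Expr::Object(props) => {
            let props: Vec<String> = props
                .iter()
                .map(|(k, v)| format!("{}: {}", emit_str(k), emit(v)))
                .collect();
            format!("{{ {} }}", props.join(", "))
        }
    }
}

/// Wraps an already-built node after the fact, e.g. once directives are known.
fn with_directives(node: Expr, directives: Expr) -> Expr {
    Expr::Call {
        callee: "_withDirectives".to_string(),
        args: vec![node, directives],
    }
}
```

Since the element node is still a value when directives are processed, wrapping it is a single function call rather than string surgery on already-generated code.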
Additionally, I plan to create a plugin system to support manipulations on AST from external plugins.
Pattern matching is pretty tedious, yet imho it's more idiomatic to Rust (only an opinion). It drives me crazy sometimes (that's why I have the `utils.rs` which you linked), but I like rust-analyzer's "Fill match arms" feature. It actually gives some solid insights, and I have already identified a couple of bugs in the existing Vue SFC compiler.
Regarding `JsWord`, I unfortunately don't see why you use `Range<usize>`. I mean it's convenient for source maps and pretty performant, but how do you intend to transform the AST?
`JsWord` is a basic `Atom` from the `string_cache` crate, i.e. an interned string. In my code I clone `JsWord`s all over the place, yet I don't see any implications after running the test suite, which analyzes 86 test cases (43 snippets * JS+TS) in a negligible time:
```
running 15 tests
# omitted
test result: ok. 15 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
real 0m0,008s
user 0m0,014s
sys 0m0,006s
```
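The trade-off between the two identifier representations can be sketched as follows. This is only an illustration under stated assumptions: `Interner` is a hypothetical single-threaded pool built on `Rc<str>`, not how `string_cache` implements `Atom`, and `resolve` is a hypothetical helper for the `Range<usize>` approach.

```rust
use std::collections::HashSet;
use std::ops::Range;
use std::rc::Rc;

/// Minimal interner: equal strings share one allocation,
/// so cloning an atom is only a reference-count bump.
#[derive(Default)]
struct Interner {
    pool: HashSet<Rc<str>>,
}

impl Interner {
    /// Returns the shared atom for `s`, allocating only on first sight.
    fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(existing) = self.pool.get(s) {
            return Rc::clone(existing);
        }
        let atom: Rc<str> = Rc::from(s);
        self.pool.insert(Rc::clone(&atom));
        atom
    }
}

/// Span-based identifier: just two offsets, but it can only name text
/// that already exists in the original source.
fn resolve<'a>(source: &'a str, span: &Range<usize>) -> &'a str {
    &source[span.clone()]
}
```

A span is cheap and maps directly onto source maps, but a generated name has no span to point at, whereas an interned atom carries its own text and can be created during a transform.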
Double-checked. In your example, do you handle the "Shorthand" case? I.e. `a` in `{ a }`?
I couldn't find the analogue, but I also don't know how `ast_grep_core` works (btw. amazing tool :fire:)
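To make the "Shorthand" question concrete, here is a small sketch of how an exhaustive `match` surfaces that case. The `Prop` enum below is a hypothetical simplification (swc's real property type has more variants and richer payloads), and `referenced_idents` is an illustrative helper, not code from either repository.

```rust
/// Hypothetical, simplified object-property AST.
enum Prop {
    /// `{ a }`: the key is also a read of the variable `a`.
    Shorthand(String),
    /// `{ key: value }`, with `value` restricted to an identifier here.
    KeyValue { key: String, value: String },
    /// `{ method() {} }`: introduces no outer read by itself.
    Method { name: String },
}

/// Collects the identifiers an object literal reads from the enclosing scope.
fn referenced_idents(props: &[Prop]) -> Vec<&str> {
    let mut out = Vec::new();
    for prop in props {
        // No wildcard arm: adding a variant later makes this match fail to
        // compile until the new case is handled explicitly.
        match prop {
            Prop::Shorthand(name) => out.push(name.as_str()),
            Prop::KeyValue { value, .. } => out.push(value.as_str()),
            Prop::Method { .. } => {}
        }
    }
    out
}
```

With a typed AST the compiler lists the shorthand variant for you; with a pattern-based matcher it has to be covered by a separate pattern, which is easy to forget.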
TLDR: Rslint is still better than swc.
The code changes are in swc branch and rslint branch.
swc pulls in a lot of dependencies regardless of whether they are relevant to core parsing. Such bloated dependencies place a huge burden on compilation and slow rust-analyzer almost to a freeze. swc's docs and examples are scarce. The only working example is https://rustdoc.swc.rs/swc_ecma_parser/. Reading the code is also hard. The abstraction and module organization are, well, hazy, at least to the uneducated. Peeking at a definition is hard, if possible at all, given the massive usage of macros. Alas, the macros are also the perpetrator of the sluggish compilation. :/ Using swc is not a nice journey, actually. Looking at the example above, it immediately requires several crates other than the parser: `common`, `ast`, `atom`, `visit` and blah blah. And the core `impl Visitor` has 200+ macro-generated methods to implement without a single line of documentation. The usability is poor... And the output is large: merely importing swc pushes the binary size to 33MB.
Rslint at least has more comments and documentation than swc. Its underlying crate, rowan, also has docs. So it might be a better choice. Rslint's dependencies are also more lightweight. Rslint's source code is also simple and clear compared to swc. Understanding the Green/Red tree does require some learning, but that is fine. The binary size is 10MB after importing Rslint, about one third of swc's.