Closed by wooorm 3 years ago
Estree has the drawback of being a fragmented ecosystem: there are no nice parsers that support comments; there are no tree-walkers or compilers that support JSX
ESLint's parser and walker have solid ESTree + Comment + JSX support https://github.com/eslint/espree https://github.com/eslint/eslint-visitor-keys
Prettier has espree with Comment + JSX support for code gen: https://github.com/prettier/prettier/blob/902d524d2f1776efe0b110c1a24813d4d7fcb9d0/src/language-js/printer-estree.js. escodegen is close to having ESTree + JSX support: https://github.com/estools/escodegen/pull/391
I'm coming at this from the perspective of personally using MDX more as a build tool than as a runtime component, and of liking both proposals and TypeScript features. That draws me more towards Babel: the ability to parse new syntax, the option to support TypeScript syntax, and the broad support for Babel within Node/JavaScript tools are all draws. Because I mostly use it as a build tool, bundle size is less of a priority for me.
If we have to pick just one, I'd lean Babel.
That being said, do we need to pick just one? Could the JavaScript parsing strategy be made pluggable?
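To make the pluggable idea concrete, here is a minimal sketch of what such a contract could look like: a `parse` option that takes a string and must return an ESTree `Program`. Everything in it is hypothetical. `parseWith`, the `parse` option, `unwrapFile`, and both stand-in parsers are invented for illustration and are not an existing MDX API. The stand-ins only understand a single identifier so that the example runs without installing acorn or Babel.

```javascript
// Stand-in for an acorn-backed parser (assumption: a real one would call
// into acorn). This toy version only understands a single identifier.
function defaultParse(source) {
  return {
    type: 'Program',
    sourceType: 'module',
    body: [
      {
        type: 'ExpressionStatement',
        expression: {type: 'Identifier', name: source.trim()}
      }
    ]
  };
}

// Stand-in for a parser that wraps its result in a `File` node instead of
// returning a `Program` directly.
function fileParse(source) {
  return {type: 'File', program: defaultParse(source)};
}

// Adapter: unwrap a `File` so the result satisfies the `Program` contract.
function unwrapFile(parse) {
  return (source) => {
    const tree = parse(source);
    return tree.type === 'File' ? tree.program : tree;
  };
}

// Hypothetical entry point: use the given parser, or fall back to the default.
function parseWith(source, options = {}) {
  const parse = options.parse || defaultParse;
  const tree = parse(source);
  if (!tree || tree.type !== 'Program') {
    throw new Error('Expected the parser to return an ESTree `Program`');
  }
  return tree;
}

const viaDefault = parseWith('x');
const viaPlugin = parseWith('x', {parse: unwrapFile(fileParse)});
```

The catch with this design is that every later step has to agree on one tree format, so a pluggable parser only helps if its output is (or can be adapted to) estree.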
Offering another consideration: if bundle size is the primary goal, Acorn may not be the smallest option. Wasm can pack smaller than JS, for example https://bundlephobia.com/result?p=@swc/core@1.2.40, and still allows for custom transforms if needed (https://swc.rs/docs/usage-plugin). There are also other estree-like, JavaScript-based parsers such as https://github.com/meriyah/meriyah and https://github.com/KFlash/seafox
/cc @ChristopherBiscardi since this approach has some potential tie ins to https://github.com/mdx-js/rust
edit: correction, bundlephobia ignores wasm. The library may be faster, but it is not smaller: https://unpkg.com/browse/@swc/wasm@1.2.40/
Thanks for all this research folks! I'd lean towards something smaller than Babel but I'm not very opinionated there. There are lots of client-side usages of MDX that won't go away, and Babel is pretty huge and pretty slow in comparison to other options. Considering we're mostly only using Babel for internals we could port it away without users really needing to know the difference.
Also, with wooorm's new JSX parsing, we can drop a bunch of the internals we use and manipulate the AST directly!
@ChristianMurphy I definitely wouldn't hold up any changes here based on the work in /rust. If our priority is small, then wasm is probably not the answer at the moment. swc is what I'm planning to use for /rust's js parsing and we could invest there more in the future but it's not a solution for today's in-browser use cases IMO.
that said, swc is hella faster than babel in my experience from working with it in toast (via the Rust APIs), and will work well for node-backed stuff if we're looking for a speed boost at some point in the future (TBD, caveats apply, /rust is an experiment, etc)
ESLint's parser and walker have solid ESTree + Comment + JSX support [...] escodegen is close to having ESTree + JSX support [...] — @ChristianMurphy
espree seems to be a tiny wrapper around acorn and acorn-jsx 🤔 And a year-old stalled PR is not really “close” 😅 Those visitor keys are great btw! Especially as espree is ± the same AST as acorn + acorn-jsx!
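As a rough illustration of why a visitor-keys table is handy, a tree walker that understands JSX can be very small once it knows which fields of each node type hold children. The sketch below uses a hand-written subset of such a table rather than importing eslint-visitor-keys, so the table contents and the `walk` helper are assumptions made to keep the example self-contained. The tree is built by hand for `<a href={url}>hi</a>`, with positional info omitted.

```javascript
// Hand-written subset of a visitor-keys table (node type -> child fields).
// Assumption: a real table, such as the one in eslint-visitor-keys, covers
// every node type; this one covers only what the example tree needs.
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
  JSXElement: ['openingElement', 'children', 'closingElement'],
  JSXOpeningElement: ['name', 'attributes'],
  JSXClosingElement: ['name'],
  JSXAttribute: ['name', 'value'],
  JSXExpressionContainer: ['expression'],
  JSXIdentifier: [],
  JSXText: [],
  Identifier: [],
  Literal: []
};

// Depth-first walk: call `enter` on a node, then visit its children in the
// order given by the table.
function walk(node, enter) {
  if (!node || typeof node.type !== 'string') return;
  enter(node);
  for (const key of visitorKeys[node.type] || []) {
    const child = node[key];
    if (Array.isArray(child)) {
      for (const item of child) walk(item, enter);
    } else {
      walk(child, enter);
    }
  }
}

// Hand-built tree for `<a href={url}>hi</a>`.
const tree = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'JSXElement',
        openingElement: {
          type: 'JSXOpeningElement',
          name: {type: 'JSXIdentifier', name: 'a'},
          attributes: [
            {
              type: 'JSXAttribute',
              name: {type: 'JSXIdentifier', name: 'href'},
              value: {
                type: 'JSXExpressionContainer',
                expression: {type: 'Identifier', name: 'url'}
              }
            }
          ],
          selfClosing: false
        },
        children: [{type: 'JSXText', value: 'hi', raw: 'hi'}],
        closingElement: {
          type: 'JSXClosingElement',
          name: {type: 'JSXIdentifier', name: 'a'}
        }
      }
    }
  ]
};

// Collect the node types in visiting order.
const seen = [];
walk(tree, (node) => seen.push(node.type));
```

Because the walker itself knows nothing about JSX, supporting more syntax is a matter of adding rows to the table.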
Porting our internals from Babel to estree is not a lot of work. Three small plugins: https://github.com/mdx-js/mdx/blob/68ff02c8129e2922f48b59bf51f4b967d248f397/packages/mdx/mdx-hast-to-jsx.js#L6-L8.
For a nice JSX serializer, we could look into adding that to escodegen, astring, or whatever else is nice.
But as we’re thinking of compiling JSX away, that’s not needed. Rather, forking babel-helper-builder-react-jsx-experimental for estree seems to be the way to go (not sure about Vue though...).
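To give a feel for what compiling JSX away on estree involves, here is a minimal sketch that turns a `JSXElement` node into a `React.createElement(...)` call expression. It is not a port of the Babel helper. The function names (`jsxToCall`, `toCallee`, `toProps`, `toChild`) are made up, and it assumes only plain element names, plain attributes, text, and expression children. Spread attributes, member or namespaced names, fragments, and JSX whitespace trimming are left out.

```javascript
// Lowercase names become string literals (host elements); anything else is
// treated as a reference to a component.
function toCallee(name) {
  if (name.type !== 'JSXIdentifier') {
    throw new Error('Only plain JSX identifiers are handled in this sketch');
  }
  return /^[a-z]/.test(name.name)
    ? {type: 'Literal', value: name.name}
    : {type: 'Identifier', name: name.name};
}

// Attributes become an object expression, or `null` when there are none.
function toProps(attributes) {
  if (attributes.length === 0) return {type: 'Literal', value: null};
  return {
    type: 'ObjectExpression',
    properties: attributes.map((attribute) => {
      if (attribute.type !== 'JSXAttribute' || attribute.name.type !== 'JSXIdentifier') {
        throw new Error('Only plain attributes are handled in this sketch');
      }
      let value;
      if (attribute.value == null) {
        value = {type: 'Literal', value: true}; // `<a download>`
      } else if (attribute.value.type === 'JSXExpressionContainer') {
        value = attribute.value.expression; // `<a href={url}>`
      } else {
        value = attribute.value; // `<a href="x">`, already a Literal
      }
      return {
        type: 'Property',
        kind: 'init',
        method: false,
        shorthand: false,
        computed: false,
        key: {type: 'Literal', value: attribute.name.name},
        value
      };
    })
  };
}

function toChild(child) {
  if (child.type === 'JSXText') return {type: 'Literal', value: child.value};
  if (child.type === 'JSXExpressionContainer') return child.expression;
  if (child.type === 'JSXElement') return jsxToCall(child);
  throw new Error('Unsupported child: ' + child.type);
}

// `<a href={url}>hi</a>` -> `React.createElement('a', {'href': url}, 'hi')`
function jsxToCall(node) {
  return {
    type: 'CallExpression',
    callee: {
      type: 'MemberExpression',
      object: {type: 'Identifier', name: 'React'},
      property: {type: 'Identifier', name: 'createElement'},
      computed: false,
      optional: false
    },
    arguments: [
      toCallee(node.openingElement.name),
      toProps(node.openingElement.attributes),
      ...node.children.map((child) => toChild(child))
    ],
    optional: false
  };
}

// Hand-built tree for `<a href={url} download>hi</a>`.
const call = jsxToCall({
  type: 'JSXElement',
  openingElement: {
    type: 'JSXOpeningElement',
    name: {type: 'JSXIdentifier', name: 'a'},
    attributes: [
      {
        type: 'JSXAttribute',
        name: {type: 'JSXIdentifier', name: 'href'},
        value: {
          type: 'JSXExpressionContainer',
          expression: {type: 'Identifier', name: 'url'}
        }
      },
      {type: 'JSXAttribute', name: {type: 'JSXIdentifier', name: 'download'}, value: null}
    ],
    selfClosing: false
  },
  children: [{type: 'JSXText', value: 'hi', raw: 'hi'}],
  closingElement: {type: 'JSXClosingElement', name: {type: 'JSXIdentifier', name: 'a'}}
});
```

The result is plain estree without JSX nodes, so an existing generator that does not know JSX should be able to serialize it. Targeting another runtime would mostly mean swapping the callee.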
Subject of the discussion
With https://github.com/mdx-js/mdx/pull/1382, we now have a JavaScript syntax tree.
The tree starts out in estree: as markdown + mdx.js is parsed simultaneously, I needed a JavaScript parser in micromark-extension-mdxjs, and I chose a small and fast one: acorn, which comes with estree. Acorn is small, 30kb minzipped. acorn-jsx is 4kb. astring (a generator) is also 4kb.
Previously, in this project, we used Babel for plugins. Babel is giant. @babel/core, which has methods to run Babel plugins, is like 220kb minzipped. @babel/generator is 63kb. @babel/parser is 60kb. @babel/traverse is 165kb (it includes both the parser and the generator).
Estree has the drawback of being a fragmented ecosystem: there are no nice parsers that support comments; there are no tree-walkers or compilers that support JSX. And importantly, as we use JSX, we’d want to turn JSX into function calls (React/preact/vue), but those are all Babel plugins. We could use estree but then users would still need to run Babel afterwards.
Babel has the drawback of being giant and slow. But the good thing is that the JSX -> JS compilers all live there.
Problem
What should we go with? We can’t turn JSX -> JS unless we’re using Babel (well, we could, the Babel plugin to turn JSX -> _jsx()/React.createElement is 800 lines). Most users probably want to use Babel plugins to turn their fancy features into whatever. An estree-only system as a base for MDX would be ✨✨✨. @mdx-js/runtime is now 350kb minzipped. That could go down to 100kb or less?