mdx-js / mdx

Markdown for the component era
https://mdxjs.com
MIT License

Pass options to `remark-mdx` #2208

Closed y-nk closed 1 year ago

y-nk commented 1 year ago

Problem

There's currently no way to pass options to remark-mdx although it supports a bunch of options.

The reason is here:

https://github.com/mdx-js/mdx/blob/600b12ade3c1cd5bcbca9afc160b8456aa33a82e/packages/mdx/lib/core.js#L103-L105

Passing options there would let them flow to:

https://github.com/mdx-js/mdx/blob/600b12ade3c1cd5bcbca9afc160b8456aa33a82e/packages/remark-mdx/index.js#L20-L24

and eventually down to: https://github.com/micromark/micromark-extension-mdxjs

Solution

Add an optional remarkMdxOptions: {/* the type of options */} into the ProcessorOptions, and make it available everywhere.

Alternatives

There's no alternative. I'm willing to make the PR if this feature request is considered minor enough to pass.

wooorm commented 1 year ago

Which options do you want? The alternative is: do not let people change how the MDX format works at this level of abstraction. Options should likely not be passed. It will instead likely break things.

y-nk commented 1 year ago

i'd like to pass options to acorn. actually i wish there were a pre-remarkMdx processing step, but i understand it's not an option. the reason is that i want to modify the source of my mdx so that i can write:

<style> * { color: red; }</style>

and decorate it to become jsx compliant (such as):

<style jsx>{` * { color: red; } `}</style>

the modification is subtle enough to work properly, but there's no insertion point for me in there (i already tried a remark plugin)
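
For illustration only, the rewrite described above can be sketched as a plain string transform that would run on the source before any MDX parsing. The helper name `wrapStyleTags` is hypothetical and not part of MDX or remark, and the regex assumes attribute-free `<style>` tags. It would also match `<style>` text inside code blocks or strings, so it is a sketch of the idea rather than a safe implementation.

```javascript
// Hypothetical helper (not an MDX or remark API): wraps the body of each
// plain <style> element in a JSX expression holding a template literal.
function wrapStyleTags(source) {
  return source.replace(/<style>([\s\S]*?)<\/style>/g, (match, css) => {
    // Escape backslashes, backticks, and `${`, which are special inside
    // a template literal.
    const escaped = css.replace(/[\\`]|\$\{/g, (m) => '\\' + m)
    return '<style jsx>{`' + escaped + '`}</style>'
  })
}

console.log(wrapStyleTags('<style> * { color: red; }</style>'))
```

The CSS is passed through unchanged apart from the escaping, so braces in the stylesheet no longer reach the JSX parser as expressions.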

wooorm commented 1 year ago

~~You can already write both inputs you show? And you can use plugins to turn both inputs into each other?~~ Wait, you mean to write invalid JSX, and then turn it into something else? How would an acorn option solve that?

y-nk commented 1 year ago

yes, i'd like to write invalid jsx (but valid html) and have a chance to correct the input between remarkParse and remarkMdx

I've been looking closely at the options i could pass, and of all of them only an "acorn compliant" parser instance might give me a chance to do something; it's not guaranteed though, and i must admit it's unlikely to succeed... but there's no other entrypoint for me in between remarkParse and remarkMdx, so i don't see any other option.

this would still be extremely useful since we could extend the markdown syntax as we please (for a private purpose) which can't be done at the moment. at the moment, some syntax would work and some would create a parse error (notably if you involved curly braces, but not only)

i understand the jsx syntax, and what mdx is solving tho, so i'm not asking to change the goal of those. i'm simply willing to hand proper jsx to remarkMdx but sadly there's no entrypoint.

note that i could rewrite my own @mdx/rollup, and then my own @astrojs/mdx integration for this. it seems a lot of code for something that could be done at this exact .use(remarkMdx) and which could maybe benefit other people.

wooorm commented 1 year ago

No acorn option will give you what you want. a) Acorn does not have such an option, b) acorn isn’t used for the JS(X) you show.

but there's no other entrypoint for me in between remarkParse and remarkMdx, so i don't see any other option.

Further remark plugins have access to the parser extensions too: you can remove previously added extensions. You can change them. You can add more.

this would still be extremely useful since we could extend the markdown syntax as we please [...] which can't be done at the moment

You can already do so, you can integrate with the markdown parser. You likely shouldn’t though: https://github.com/micromark/micromark#extending-markdown.

at the moment, some syntax would work and some would create a parse error (notably if you involved curly braces, but not only)

This is by design.

i'm simply willing to hand proper jsx to remarkMdx but sadly there's no entrypoint.

This is not simple, because it is impossible. You also probably don’t need to do what you are asking for.

note that i could rewrite my own @mdx/rollup, and then my own @astrojs/mdx integration for this

There is a reason Astro stopped maintaining its own flavor of MDX-like markdown: it’s extremely complex. I don’t recommend you do so.

it seems a lot of code for something that could be done at this exact .use(remarkMdx) and which could maybe benefit other people

Yes, forking all of remark and MDX is a lot of work. But your proposed solution doesn’t solve what I believe to be your root question. It can’t.


If you don’t want JSX, you don’t need the MDX format. You can use the markdown format. In markdown, you can write HTML. When you use rehype-raw, like so, this is supported in these projects. You can then still turn those elements into components by converting the nodes that represent plain elements into nodes that represent components: https://github.com/syntax-tree/mdast-util-mdx-jsx#syntax-tree.


Closing as no acorn option will allow the syntax you are asking for.

y-nk commented 1 year ago

@wooorm thanks for taking the time to answer each point by point. ~~Could you eventually give me some pointers to go further in this direction:~~

~~Further remark plugins have access to the parser extensions too: you can remove previously added extensions. You can change them. You can add more.~~

~~anything is appreciated, from google keywords to links. i had no clue this was possible, i would have tried this otherwise.~~

I ran console.log(this) in my plugin function and found attachers[], which is more than enough. thanks a lot :)

wooorm commented 1 year ago

attachers is likely not what you want. Those are the plugins that are already attached. The data is probably what you want instead, e.g., data.micromarkExtensions.

y-nk commented 1 year ago

@wooorm thanks a lot for your patience and guidance. i've been spending the holidays trying to learn the api but so far i haven't achieved anything.

The way i'm trying is by loading a remark plugin which modifies data().micromarkExtensions. this is because astro's options (which are basically a @mdx/rollup wrapper) don't seem to allow passing extensions (and htmlExtensions) down to micromark.

something along the lines of:

function tokenize(effects, ok, nok) {
  console.log('tokenize')
  return function start(code) {
    console.log('start:', effects, code);
    return nok(code)
  }
}

function rewriteStyleJsxPlugin() {
  const data = this.data()

  // hardcoded for debug - not sure if text construct is good,
  // but just want to console.log something
  const arr = data.micromarkExtensions[0].text[123]

  // pushing the plugin there, 
  arr.unshift({ name: 'styleJsx', tokenize })
  console.log(arr)
}

// https://astro.build/config
import { defineConfig } from 'astro/config'
import mdx from '@astrojs/mdx'

export default defineConfig({
  integrations: [
    mdx({
      remarkPlugins: [
        rewriteStyleJsxPlugin
      ],
      extendPlugins: false,
    }),
  ],
});

I can see my function in the log of arr but nothing pops in the console after that. My thinking is that all of this arrives too late and the whole flow of micromark is already done when my remarkPlugin executes.

I've not yet followed the track of passing extension[] directly in the options (i've seen that all options are carefully passed as rest arguments, which is very convenient but hard to track) - but it seems Astro doesn't follow this method and I could be stuck quite fast.

wooorm commented 1 year ago

For your micro problem, just remove all data.micromarkExtensions first and add your own later. Then you should see your console.log output.
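
A minimal sketch of that debugging step, under stated assumptions: the names `replaceMicromarkExtensions`, `styleJsxExtension`, and `tokenizeStyleJsx` are hypothetical, and character code 60 (`<`) is an arbitrary choice for the hook, where the snippet earlier in the thread used 123. It assumes, as the thread describes, that `this.data()` in a plugin returns the processor's data object and that the parser reads `micromarkExtensions` from it at parse time. Clearing the list also drops whatever other plugins registered, including the MDX syntax, so this is only useful for checking that the tokenizer runs at all.

```javascript
// Hypothetical tokenizer that only logs and then declines, so parsing
// falls through to the other constructs for this character.
function tokenizeStyleJsx(effects, ok, nok) {
  return function start(code) {
    console.log('start:', code)
    return nok(code)
  }
}

// A micromark syntax extension is keyed by content type, then by
// character code. 60 is `<` (an assumption made for this sketch).
const styleJsxExtension = {
  text: {60: {name: 'styleJsx', tokenize: tokenizeStyleJsx}}
}

// Plugin body: `this` is the processor the plugin is attached to.
function replaceMicromarkExtensions() {
  const data = this.data()
  // Drop every previously registered syntax extension (debugging only),
  // then register the hypothetical one.
  data.micromarkExtensions = [styleJsxExtension]
}
```

Whether the replacement takes effect depends on the plugin being attached after the plugins whose extensions it removes.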


Maybe good to know: parsers are incredibly hard. It’s a full time job for me to maintain everything. And then other people are helping too. And then MDX is several parsers, several ASTs. Again, Astro has several people and they stopped their fork. Unless you want to spend 2 months* learning how micromark, mdast, and unified work, I really don’t recommend doing this.

If you do want to spend a ton of time on this, I recommend first designing how your custom syntax works. Is there nesting in <style>? JS? More tags? Markdown? Is it JSX-like or HTML-like syntax? How do escapes or character references work? Does your format extend to other tags, or just style? Are there props or attributes? Block-level or inline-level? Etc.

* rough estimate

y-nk commented 1 year ago

Thanks again, the directions are very much appreciated.


I've no doubt that parsers are hard. I felt it before from several reads on the subject, but navigating through your ecosystem sure taught me a lesson or two. It's incredible that you're patient enough to answer me :) To be frank regarding the scope of my problem, i really only need to wrap the content of the <style> tag in a "curly braces/backticks" combo so that it's jsx compliant - of course i could do it differently as well, but i'm taking this as a chance to learn from your code 🙏 It's a low priority/requirement and I already found an acceptable alternative (which is to specify the name of a css file in frontmatter) but again, the issue began as a quick hack and turned into a learning journey.

In this regard, it's an incredibly complex web of packages and I found that the structure provided in the mdxjs/mdx repo (with a monorepo) was quite convenient. If you ever need someone to do the grunt work of moving the micromark-mdx-* packages into the same monorepo structure, please let me know. I can't dedicate full time to it, but I'd be happy to help.

wooorm commented 1 year ago

One more idea: regex for <style> tags, take the value inside them (let’s call it x), and replace it with x.replace(/[^\n]/g, ' '). That way all the positional info remains intact. Afterwards you could patch those styles back into the document. You might get styles in code and strings too though.
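
A rough sketch of that idea, assuming attribute-free `<style>` tags. The helper names `maskStyles` and `restoreStyles` are hypothetical. The restore step is shown on the string purely for illustration, since the thread does not specify how the styles would be patched back in a real pipeline (for example, on the syntax tree after parsing).

```javascript
// Blank out everything inside <style>...</style> except newlines, so that
// every offset, line, and column in the document stays the same. The
// original contents are kept so they can be patched back later.
function maskStyles(source) {
  const saved = []
  const masked = source.replace(
    /(<style>)([\s\S]*?)(<\/style>)/g,
    (match, open, css, close) => {
      saved.push(css)
      return open + css.replace(/[^\n]/g, ' ') + close
    }
  )
  return {masked, saved}
}

// Put the saved contents back, in document order.
function restoreStyles(masked, saved) {
  let index = 0
  return masked.replace(
    /(<style>)([\s\S]*?)(<\/style>)/g,
    (match, open, blank, close) => open + saved[index++] + close
  )
}
```

Because only non-newline characters are replaced one for one, the masked text has the same length and line count as the input, which is what keeps the positional info intact.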

it's an incredibly complex web of packages and I found that the structure provided in the mdxjs/mdx repo (with a monorepo)

It is indeed very complex, I don’t think that can be solved. MDX is also simpler because a lot of the code is abstracted away into packages. There’s a ton of code for micromark and mdast, much, much more than for MDX, and I don’t think it fits into a maintainable monorepo.

All these different packages allow you to do what you want: you can swap out one part (micromark/mdast JSX), and still have the rest! It’s complex, all these APIs, but that’s what allows switching stuff out.