mdx-js / mdx

Markdown for the component era
https://mdxjs.com
MIT License

Unable to parse custom heading ids #2485

Closed kachkaev closed 3 months ago

kachkaev commented 3 months ago

Affected packages and versions

@mdx-js/esbuild@3.0.1

Link to runnable example

No response

Steps to reproduce

  1. Open MDX Playground: https://mdxjs.com/playground/
  2. Add {#custom-id} to the header:
    - # Hello, world!
    + # Hello, world! {#custom-id}

Expected behavior

The document parses. Ideally, {#custom-id} is understood as a custom heading id, according to Extended markdown syntax. If this kind of parsing is out of scope, {#custom-id} should at least remain part of the heading text and not break the parsing of the whole document.

Actual behavior

Error at 1:18: Could not parse expression with acorn

Additional context

I am here after trying to upgrade from contentlayer to contentlayer2 (the fork upgraded mdx dependencies). I was able to use remark-custom-heading-id previously, but it stopped working after the upgrade. An alternative plugin (remark-heading-id) seems to suffer from the same issue. Custom ids are quite important for documents in non-Latin alphabets, so it’d be great to figure out how to make them work in the latest mdx ecosystem.

remcohaszing commented 3 months ago

In MDX, what’s between { and } needs to be a valid JavaScript expression. Since #custom-id isn’t valid JavaScript, this is a parsing error.
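A rough way to see this is to check whether the text between the braces compiles as a JavaScript expression. The sketch below uses the built-in Function constructor as a stand-in. MDX itself parses expressions with acorn (with JSX enabled), so this is only an approximation, and isValidExpression is a hypothetical helper name, not part of MDX. Unlike MDX, a plain JavaScript parser also rejects JSX such as <a name="custom-id" />.

```javascript
// Hypothetical helper: returns true when `source` compiles as a plain
// JavaScript expression. The code is only compiled, never executed.
// Assumption: the Function constructor stands in for acorn here.
function isValidExpression(source) {
  try {
    new Function('return (' + source + ');')
    return true
  } catch (error) {
    if (error instanceof SyntaxError) return false
    throw error
  }
}

console.log(isValidExpression('#custom-id')) // false
console.log(isValidExpression('"custom-id"')) // true
console.log(isValidExpression("'x-' + Date.now()")) // true
```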

remark-custom-heading-id extends markdown syntax with something custom. The problem with extending syntax is that it can conflict with other syntax, as is the case here. I suggest one of the following solutions:

  1. Use rehype-slug instead of remark-custom-heading-id.
  2. Use JSX
    <h1 id="custom-id">Hello, world!</h1>
  3. Upvote the request for remark-custom-heading-id to support another delimiter. Perhaps even send a pull request.

Personally I recommend going with option 1 or 2.
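As a configuration sketch for option 1, assuming you call @mdx-js/mdx directly from an ES module (a framework such as contentlayer would take the plugin through its own options instead), rehype-slug can be passed in rehypePlugins. Note that it generates ids from the heading text rather than reading a custom one:

```javascript
import {compile} from '@mdx-js/mdx'
import rehypeSlug from 'rehype-slug'

// rehype-slug adds an `id` to headings that do not have one yet.
const file = await compile('# Hello, world!', {
  rehypePlugins: [rehypeSlug]
})

// The compiled JavaScript module, as a string.
console.log(String(file))
```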

wooorm commented 3 months ago

You can also use JSX:

# Some heading {<a name="custom-id" />}

MDX intentionally has a strictly defined syntax mixing CommonMark with ESM/JSX/expressions, and will not deviate out of the box.

kachkaev commented 3 months ago

Thanks folks! Both suggestions make sense. Creating a special case for {#...} in headings would indeed be quite ugly for MDX, so using this syntax is not an option.

I still haven’t wrapped my head around generating a toc for your examples, so I will probably pause my investigation for now. If anyone has ideas, please share! This is what I am using to extract the toc: https://github.com/shadcn-ui/ui/blob/13d9693808badd4b92811abac5e18dc1cddf2384/apps/www/lib/toc.ts#L78

This code ignores <h2 id="custom-id">...</h2> and does not pick up the id from ## ... {<a name="custom-id" />}.

wooorm commented 3 months ago

Ah, yes, that code there looks a) not too smart, b) focussed on plain static markdown.

As MDX/JSX are JavaScript, which is evaluated somewhere, to properly support MDX you’d need to handle that. E.g., what if there was an id={'x-' + Date.now()}?
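To illustrate the difference, here is a minimal sketch of a toc extractor that also looks at JSX headings. It walks a hand-built tree whose node shapes mimic what remark-mdx produces (mdxJsxFlowElement, mdxJsxAttribute, mdxJsxAttributeValueExpression); a real tree would come from a parser. extractToc, textOf and isJsxHeading are hypothetical names. A static id="..." can be read directly, while an expression such as id={'x-' + Date.now()} can only be flagged as dynamic, because its value does not exist until the code is evaluated.

```javascript
// Hand-built nodes, assumed to mirror the shapes remark-mdx would produce.
const tree = {
  type: 'root',
  children: [
    {type: 'heading', depth: 1, children: [{type: 'text', value: 'Hello, world!'}]},
    {
      type: 'mdxJsxFlowElement',
      name: 'h2',
      attributes: [{type: 'mdxJsxAttribute', name: 'id', value: 'custom-id'}],
      children: [{type: 'text', value: 'Static id'}]
    },
    {
      type: 'mdxJsxFlowElement',
      name: 'h2',
      attributes: [
        {
          type: 'mdxJsxAttribute',
          name: 'id',
          value: {type: 'mdxJsxAttributeValueExpression', value: "'x-' + Date.now()"}
        }
      ],
      children: [{type: 'text', value: 'Dynamic id'}]
    }
  ]
}

// Collect plain text, skipping expressions and other non-text nodes.
function textOf(node) {
  if (node.type === 'text' || node.type === 'inlineCode') return node.value
  return (node.children || []).map(textOf).join('')
}

function isJsxHeading(node) {
  return (
    (node.type === 'mdxJsxFlowElement' || node.type === 'mdxJsxTextElement') &&
    /^h[1-6]$/.test(node.name || '')
  )
}

function extractToc(node, entries = []) {
  if (node.type === 'heading') {
    // Markdown heading: no explicit id in the tree (a slugger would add one).
    entries.push({depth: node.depth, title: textOf(node), id: null, dynamic: false})
  } else if (isJsxHeading(node)) {
    const attr = (node.attributes || []).find(
      (a) => a.type === 'mdxJsxAttribute' && a.name === 'id'
    )
    const isStatic = Boolean(attr) && typeof attr.value === 'string'
    entries.push({
      depth: Number(node.name.slice(1)),
      title: textOf(node),
      id: isStatic ? attr.value : null,
      // An expression value cannot be resolved without evaluating it.
      dynamic: Boolean(attr) && !isStatic
    })
  } else {
    for (const child of node.children || []) extractToc(child, entries)
  }
  return entries
}

const toc = extractToc(tree)
console.log(toc)
```

Entries flagged as dynamic would need the compiled MDX to be evaluated (for example by rendering it) before their ids are known.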