shikijs / shiki

A beautiful yet powerful syntax highlighter
http://shiki.style/
MIT License
9.76k stars 357 forks source link

`@shikijs/rehype`: keep hast element `data` and `properties` #629

Open remcohaszing opened 6 months ago

remcohaszing commented 6 months ago

Clear and concise description of the problem

With remark-rehype / MDX, the following markdown:

```js fileName="hello.js"
console.log('hello')

generates the following hast:

```js
{
  type: 'element',
  tagName: 'pre',
  properties: {},
  children: [
    {
      type: 'element',
      tagName: 'code',
      properties: {
        className: 'language-js'
      },
      data: {
        meta: 'fileName="hello.js'
      },
      children: [
        {
          type: 'text',
          value: "console.log('hello')",
        }
      ]
    }
  ]
}

Given this has, Shiki replaces the pre element with a newly generated pre element. In this process the node’s data and properties are lost. It would be nice if Shiki keeps this information, so that other rehype plugins, such as rehype-mdx-code-props can use it. Likewise it could even be useful to preserve positional information, which is used for example by React devtools.

Suggested solution

It looks like some logic is needed in https://github.com/shikijs/shiki/blob/main/packages/core/src/code-to-hast.ts to be able to handle an existing pre and code element.

Alternative

No response

Additional context

No response

Validations

Contributes

antfu commented 5 months ago

Sure, PR welcome!

karlhorky commented 2 months ago

I was also looking for the meta information from the code fence (the fileName="hello.js" part), and I found that there is code that already handles the node.data.meta to some degree:

https://github.com/shikijs/shiki/blob/4c1de346d6e373cef3599474e2d8d35d2d98727e/packages/rehype/src/core.ts#L117-L128

But it only parses the meta string and passes the props if a custom parseMetaString is specified, as shown in the tests:

https://github.com/shikijs/shiki/blob/4c1de346d6e373cef3599474e2d8d35d2d98727e/packages/rehype/test/index.test.ts#L21-L28

eg. I was able to get pre.props.fileName to appear by configuring like this:

next.config.js

export default withMDX({
  options: {
    remarkPlugins: [],
    rehypePlugins: [
      [
        rehypeShiki,
        {
          theme: 'dark-plus',
          addLanguageClass: true,
          parseMetaString: (str) => {
            return Object.fromEntries(
              str.split(' ').reduce((prev, curr) => {
                const [key, value] = curr.split('=');
                const isNormalKey = /^[A-Z0-9]+$/i.test(key);
                if (isNormalKey) prev = [...prev, [key, value || true]];
                return prev;
              }, []),
            );
          },
        },
      ],
    ],
  },
})(config);

Would it be an acceptable smaller change to set parseMetaString() to this function from the tests by default, if it's not specified in the config?

I don't mean to invalidate the idea of keeping .data and .properties around for interop - I think that should be done as a separate change too.