mdx-js / mdx

Markdown for the component era
https://mdxjs.com
MIT License
17.43k stars 1.14k forks source link

Headings should generate ids for deep-linking #810

Closed keithjgrant closed 3 years ago

keithjgrant commented 4 years ago

Subject of the feature

Add ids to headings for deep linking.

Problem

Most markdown implementations automatically add a slugified id to all headings (h1–h6) for deep-linking support. I would expect MDX to do the same.

Expected behavior

Markdown such as ## More information should render as <h2 id="more-information">More information</h2>. This way, links with this hash deep link to the heading (e.g. https://example.com/foo#more-information).

Alternatives

Similar behavior can be accomplished by creating custom components that add the id, but this requires additional configuration.

wooorm commented 4 years ago

There's a rehype plugin for that! rehype-slug. Read more about plugins here: https://mdxjs.com/advanced/plugins

johno commented 4 years ago

Yeah, I think we shouldn't do this by default in MDX since it's pretty opinionated, but we should definitely document this in the guides.

slorber commented 4 years ago

Hey

I'm exploring how to bring i18n support to https://github.com/facebook/docusaurus where MDX files will be translated (through git or a saas like Crowdin/Transifex).

Auto-generated ids might be a problem in practice, because changing the text does change the id, which may break linking from other pages. This means that a translating changing some heading would have to go through all the other docs of the site and update all the anchor links everywhere.

Having a way to choose the id would be useful.

Related discussions about ReactJS translation system and anchor links: https://github.com/reactjs/reactjs.org/issues/1605#issuecomment-458816106 https://github.com/reactjs/reactjs.org/issues/1605#issuecomment-458819231 https://github.com/ethereum/ethereum-org-website/issues/272 https://github.com/reactjs/reactjs.org/pull/1636/files

Looks like it's possible to do as a remark plugin https://github.com/reactjs/reactjs.org/pull/1636/files#diff-9a50f5acbfae877b9f3d761e266b0339R21

Yet just wanted to expose the i18n usecase so that you might want to include core support? (btw I have similar usecases for tunability to provide targets to links, size to images etc... all can probably be solved with remark)

tillkolter commented 4 years ago

@keithjgrant This can be accomplished by using remark/plugins.

I accomplished it, with the following webpack setup:

const remarkSlugs = require('remark-slug');
const rehypeHtml = require('rehype-stringify');

...
  rules: [
    ...
      {
        test: /\.mdx?$/,
        use: [
          {
            loader: '@mdx-js/loader',
            options: {
              remarkPlugins: [remarkSlugs],
              rehypePlugins: [rehypeHtml],
            },
          },
        ]
      }
  ]
hhimanshu commented 4 years ago

@keithjgrant This can be accomplished by using remark/plugins.

I accomplished it, with the following webpack setup:

const remarkSlugs = require('remark-slug');
const rehypeHtml = require('rehype-stringify');

...
  rules: [
    ...
      {
        test: /\.mdx?$/,
        use: [
          {
            loader: '@mdx-js/loader',
            options: {
              remarkPlugins: [remarkSlugs],
              rehypePlugins: [rehypeHtml],
            },
          },
        ]
      }
  ]

Do you have an example repo that demonstrate the use of the deep-linking? A codesandbox example? I am interesting in knowing how to do this. Thank you

wooorm commented 3 years ago

Hi all! I’m going to close this as it can be accomplished with a plugin.

The internationalization context @slorber brought up is interesting, but IMO could also be in a plugin.

danon commented 2 years ago

But what about custom ids for headings?

I would like to do something in index.mdx

## Heading do something {#custom-id}

Or some other syntax.

slorber commented 2 years ago

@Danon we now use this exact syntax in Docusaurus:

https://docusaurus.io/docs/markdown-features/toc#heading-ids

Implementation is here: https://github.com/facebook/docusaurus/blob/main/packages/docusaurus-mdx-loader/src/remark/headings/index.ts

SimonSomlai commented 2 months ago

I used this inside the components prop:

    ...(["h1", "h2", "h3", "h4", "h5", "h6"].reduce(
                          (acc, heading) => ({
                            ...acc,
                            [heading]: (props: any) => {
                              const { children } = props;
                              const id = kebabCase(children);
                              return React.createElement(
                                heading,
                                {
                                  id,
                                },
                                React.createElement(
                                  "a",
                                  {
                                    href: `#${id}`,
                                  },
                                  children
                                )
                              );
                            },
                          }),
                          {}
                        ) as any),

to generate markup like this;

<h1 id="why-should-you-keep-notes"><a href="#why-should-you-keep-notes">Why should you keep notes? 🤔</a></h1>
karlhorky commented 1 month ago

In the React.dev docs (eg. on the Tutorial: Tic-Tac-Toe page), another userland implementation of "headings syntax generating ids" by @jaredpalmer and other collaborators, based on gatsby-remark-autolink-headers, which enables JSX-compatible syntax:

# What are you building? {/*what-are-you-building*/}

Screenshot 2024-08-08 at 15 41 40

Source:

https://github.com/reactjs/react.dev/blob/6671ba7b77142a73184caf863890d5ff20f2361b/plugins/remark-header-custom-ids.js

/**
 * Copyright (c) Facebook, Inc. and its affiliates.
 */

/*!
 * Based on 'gatsby-remark-autolink-headers'
 * Original Author: Kyle Mathews <mathews.kyle@gmail.com>
 * Updated by Jared Palmer;
 * Copyright (c) 2015 Gatsbyjs
 */

const toString = require('mdast-util-to-string');
const visit = require('unist-util-visit');
const toSlug = require('github-slugger').slug;

function patch(context, key, value) {
  if (!context[key]) {
    context[key] = value;
  }
  return context[key];
}

const svgIcon = `<svg aria-hidden="true" height="16" version="1.1" viewBox="0 0 16 16" width="16"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"></path></svg>`;

module.exports = ({icon = svgIcon, className = `anchor`} = {}) => {
  return function transformer(tree) {
    const ids = new Set();
    visit(tree, 'heading', (node) => {
      let children = [...node.children];
      let id;
      if (children[children.length - 1].type === 'mdxTextExpression') {
        // # My header {/*my-header*/}
        id = children.pop().value;
        const isValidCustomId = id.startsWith('/*') && id.endsWith('*/');
        if (!isValidCustomId) {
          throw Error(
            'Expected header ID to be like: {/*some-header*/}. ' +
              'Instead, received: ' +
              id
          );
        }
        id = id.slice(2, id.length - 2);
        if (id !== toSlug(id)) {
          throw Error(
            'Expected header ID to be a valid slug. You specified: {/*' +
              id +
              '*/}. Replace it with: {/*' +
              toSlug(id) +
              '*/}'
          );
        }
      } else {
        // # My header
        id = toSlug(toString(node));
      }

      if (ids.has(id)) {
        throw Error(
          'Cannot have a duplicate header with id "' +
            id +
            '" on the page. ' +
            'Rename the section or give it an explicit unique ID. ' +
            'For example: #### Arguments {/*setstate-arguments*/}'
        );
      }
      ids.add(id);

      const data = patch(node, 'data', {});
      patch(data, 'id', id);
      patch(data, 'htmlAttributes', {});
      patch(data, 'hProperties', {});
      patch(data.htmlAttributes, 'id', id);
      patch(data.hProperties, 'id', id);
    });
  };
};
karlhorky commented 1 month ago

@wooorm with the continued interest in this feature (eg. #1279 #2042 #2485) would support for the inline comment version be re-considered as a first-class MDX feature? (maybe with whitespace stripping from start + end too)

The plugins are nice, but I'm thinking especially in terms of integration with other tooling - eg. the MDX Analyzer mdx.validate.validateFragmentLinks feature, which would benefit from a 1st-class MDX feature (or at least a "blessed" syntax).

remcohaszing commented 1 month ago

The “blessed” syntax is to generate IDs based on the same formula GitHub uses, which is matched by rehype-slug, rehype-autolink-headings, and markdown language service / MDX language server.

slorber commented 1 month ago

@remcohaszing generating IDs from heading text is bad idea as soon as you start localizing content (cf my previous comment here: https://github.com/mdx-js/mdx/issues/810#issuecomment-677731097)

For example if I translate ## My Heading to french ## Mon Titre, you have many references to fix coming from other pages (that may not even me localized yet). For Docusaurus, this means that a translator translating a heading will get many broken link errors just after doing so. Having to change anchors everywhere while translaing is painful. It's even worse when using SaaS translation tools like Crowdin, where the translator does not get immediate feedback, and it becomes my responsibility to fix all the broken anchors on localized sites.

Also when switching language with a language dropdown, the original/localized anchors won't match and you won't keep the scroll position.

I still believe having hardcoded anchor IDs co-located to MDX headings is useful. That's why for Docusaurus we created a CLI that automatically add those hardcoded IDs to the headings (based on github-slugger): https://docusaurus.io/docs/next/markdown-features/toc#heading-ids

ChristianMurphy commented 1 month ago

From a language design perspective. MDX is JSX in CommonMark.

### Hello World {#my-explicit-id}

is not valid CommonMark, JSX, or a mix of the two. It is inconsistent with what the language is.


I still believe having hardcoded anchor IDs is important

Perfectly reasonable, and already supported through JSX:

<h2 id="my-heading">Mon Titre</h2>

You are of course welcome to create extensions to support other syntax features, not everything needs to be in MDX core.

karlhorky commented 1 month ago

From a language design perspective. MDX is JSX in CommonMark.

### Hello World {#my-explicit-id}

is not valid CommonMark, JSX, or a mix of the two. It is inconsistent with what the language is.

@ChristianMurphy thanks for the language points here!

What are the MDX team thoughts about the # What are you building? {/*what-are-you-building*/} syntax?

Perfectly reasonable, and already supported through JSX:

<h2 id="my-heading">Mon Titre</h2>

I guess why people are looking for another syntax in MDX is that JSX syntax gets a bit verbose.

(also especially once there are anchor tags being generated in the headings too, but I'm guessing the chances of that becoming a part of MDX core is lower)

wooorm commented 1 month ago

I still believe having hardcoded anchor IDs is important. That's why Docusaurus created a CLI to automatically generate and add those hardcoded ids to the MDX docs (based on github-slugger): docusaurus.io/docs/next/markdown-features/toc

a) If you‘re going to translate things, why not translate the URL? Why should a French speaking user not see “#mon-titre” when they hover over a URL? Why should English IDs be used? b) What guarantees that translated documents are completely translated and that they have the exact same headings, in the same order? c) great to automate this, why save this in files? Seems smarter to save this as metadata next to files?


What are the MDX team thoughts about https://github.com/mdx-js/mdx/issues/810#issuecomment-2275868756?

IIRC I came up with it.

I guess why people are looking for another syntax in MDX is that JSX syntax gets a bit verbose.

You are also commenting on old issues: JSX support in MDX wasn’t as good before 2 years ago. Plugins weren’t as good 4 and 5 years ago.

This is fine:

<a name="my-heading" />

# Mon Titre
remcohaszing commented 1 month ago

generating IDs feel like a bad idea as soon as you try to start localizing content

Without going to deep into the topic, I imagine there are multiple ways to tackle this. Maybe the IDs should match the original title. Maybe the IDs should match the translated title. Maybe they should be custom. I think it’s good that this is up to the user or framework to solve, possibly via plugins.

What are the MDX team thoughts about the # What are you building? {/*what-are-you-building*/} syntax?

I like it personally. That would be a great community plugin to have.

This is fine:

<a name="my-heading" />

This is deprecated

slorber commented 1 month ago

The reason we keep {#anchorId} is only historical and I am aware it's not valid MDX and we plan to migrate to valid syntax in the future (probably using {/* anchor */} heading comment, but maybe something else like an :anchor directive? 🤷‍♂️ )

a) If you‘re going to translate things, why not translate the URL? Why should a French speaking user not see “#mon-titre” when they hover over a URL? Why should English IDs be used? b) What guarantees that translated documents are completely translated and that they have the exact same headings, in the same order? c) great to automate this, why save this in files? Seems smarter to save this as metadata next to files?

Agree with everything, but it's more difficult to put it into practice. For now, Docusaurus has constraints that make our job and translation job simpler, assuming that all localized sites have a quite similar structure (routes), and keeping permalinks/anchors unlocalized.


This is fine:

<a name="my-heading" />

# Mon Titre

This works, but I'm not sure our user base will be happy if we recommend that solution. Also wondering if it's accessible, and if it doesn't produce a scroll offset on anchor link click compared to the heading ID. And maybe in this case it's useful to avoid auto-generating the id based on heading text to avoid a duplicate anchor leading to the same heading.


Maybe the IDs should match the translated title. Maybe they should be custom. I think it’s good that this is up to the user or framework to solve, possibly via plugins.

Note I'm just commenting here to reiterate the importance for Docusaurus to have fixed IDs instead of generated ones. I don't expect the MDX team to do anything and add a custom syntax, a userland solution is fine.

From a technical perspective generating ids is probably fine. From a translator's perspective, or the perspective of someone who manages large localized documentation sites, I think it makes things more complicated.

wooorm commented 1 month ago

This is deprecated

This works, but I'm not sure our user base will be happy if we recommend that solution.

A bit off topic, but it’s in “16.1 Obsolete but conforming features” (emphasis mine) 😅😅

I was sharing this more as another way to go about it. Such markup (a[name]) works here on GH, as in, places that generate IDs for heading automatically. For when you want multiple IDs around a place. Or different IDs. Ref: https://github.com/d3/d3-geo-polygon/blob/8414fe5e9481c69ca765785d443ca70b3baacebc/README.md?plain=1#L45