wooorm / xdm

Just a *really* good MDX compiler. No runtime. With esbuild, Rollup, and webpack plugins
http://wooorm.com/xdm/
MIT License

ESM in Gatsby #40

Closed thecodingwizard closed 3 years ago

thecodingwizard commented 3 years ago

Is it possible to use this library with Gatsby? I can't seem to import xdm in Gatsby.

Is it possible to use xdm with Gatsby as an alternative to gatsby-plugin-mdx?

wooorm commented 3 years ago

It looks like Gatsby doesn’t support actual ESM. How to use ESM with different tools is outside of the scope of this project. I found this old issue there: https://github.com/gatsbyjs/gatsby/issues/23705.

I’ll add a note in the readme to this gist: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c.

thecodingwizard commented 3 years ago

Edit: Also see the comment below for multiple improvements to this post :)


For reference in case anybody else stumbles upon this and wants to replace gatsby-plugin-mdx with xdm -- I could not get any of the options working to properly import xdm in Gatsby (see error messages from the first post). I "resolved" this issue (if you can call this a solution lol) by compiling the xdm library with webpack and copy-pasting the resulting bundle into my project, then importing that in Gatsby. My webpack configuration was as follows:

import path from 'path';

export default {
    mode: 'production',
    entry: './index.js',
    output: {
        path: path.resolve('D:\\Code\\xdm', 'dist'),
        filename: 'bundle.js',
        library: {
            name: 'xdm',
            type: 'commonjs',
        },
    },
};

and this can be used in a Gatsby plugin with something like

const { xdm } = require("./xdm-bundle.js");

or, if using ts-node:

import { xdm } from './xdm';

Adding xdm to Gatsby was unfortunately somewhat harder than I expected. The key issues were the conflict between ESM and CJS; the incompatibility with gatsby-remark-* plugins, especially gatsby-remark-images, whose functionality couldn't be replaced with other remark plugins; and extracting information like frontmatter and the table of contents.

In case it helps anyone else, I'll document what I did to replicate (some of) the functionality of gatsby-plugin-mdx (I didn't migrate MDX exports since my project didn't use them). I had TypeScript set up with ts-node; some minor syntax may need to be changed to get it working with regular JS. Keep in mind that I did not really bother to make this migration maintainable since I'm hoping for a better solution in the future...

In the gatsby-node.ts file:

exports.onCreateNode = async api => {
  const {
    node,
    actions,
    loadNodeContent,
    createContentDigest,
    createNodeId,
  } = api;

  const { createNodeField, createNode, createParentChildLink } = actions;

  if (node.internal.type === `File` && node.ext === '.mdx') {
    const content = await loadNodeContent(node);
    const xdmNode = await createXdmNode(
      {
        id: createNodeId(`${node.id} >>> Xdm`),
        node,
        content,
      },
      api
    );
    createNode(xdmNode);
    createParentChildLink({ parent: node, child: xdmNode });
  }
}

The implementation of createXdmNode:

import { createContentDigest } from 'gatsby-core-utils';
import graymatter from 'gray-matter';
import remarkAutolinkHeadings from 'remark-autolink-headings';
import remarkExternalLinks from 'remark-external-links';
import remarkFrontmatter from 'remark-frontmatter';
import gfm from 'remark-gfm';
import { remarkMdxFrontmatter } from 'remark-mdx-frontmatter';
import remarkHtmlNodes from '../mdx-plugins/remark-html-nodes.js';
import remarkToC from '../mdx-plugins/remark-toc';
import getGatsbyImage from './wrapped-gatsby-img-plugin';
import { xdm } from './xdm';

export async function createXdmNode({ id, node, content }, api) {
  let xdmNode: any = {
    id,
    children: [],
    parent: node.id,
    internal: {
      content: content,
      type: `Xdm`,
    },
  };

  let compiledResult;
  const tableOfContents = [];

  const gatsbyImage = getGatsbyImage({
    ...api,
    xdmNode,
  });

  try {
    compiledResult = await xdm.compile(content, {
      remarkPlugins: [
        gfm,
        remarkFrontmatter,
        remarkMdxFrontmatter,
        [remarkToC, { tableOfContents }],
        gatsbyImage,
        remarkHtmlNodes,
      ],
      rehypePlugins: [],
    });
    compiledResult = String(compiledResult);
  } catch (e) {
    // prefix the message with the file path to simplify debugging
    e.message = `${node.absolutePath}: ${e.message}`;
    throw e;
  }
  compiledResult = compiledResult.replace(
    /import .* from "react\/jsx-runtime";/,
    ''
  );
  compiledResult = compiledResult.replace(
    `function MDXContent(_props) {`,
    'function MDXContent(_Fragment, _jsx, _jsxs, _props) {'
  );
  compiledResult = compiledResult.replace(
    'export default MDXContent',
    'return MDXContent'
  );
  // a string pattern only replaces the first match; the global regex covers every export
  compiledResult = compiledResult.replace(/export const /g, 'const ');

  // // extract all the exports
  // const { frontmatter, ...nodeExports } = extractExports(
  //   code,
  //   node.absolutePath
  // )

  const { data: frontmatter } = graymatter(content);
  xdmNode = {
    ...xdmNode,
    body: compiledResult,
    frontmatter,
    toc: tableOfContents,
  };

  // xdmNode.exports = nodeExports

  // Add path to the markdown file path
  if (node.internal.type === `File`) {
    xdmNode.fileAbsolutePath = node.absolutePath;
  }

  xdmNode.internal.contentDigest = createContentDigest(xdmNode);

  return xdmNode;
}

Some things to note:

remarkToC implementation:

const mdastToString = require('mdast-util-to-string');
const Slugger = require('github-slugger');

module.exports = ({ tableOfContents }) => {
  const slugger = new Slugger();

  function process(node) {
    if (node.type === 'heading') {
      const val = {
        depth: node.depth,
        value: mdastToString(node),
        slug: slugger.slug(mdastToString(node), false),
      };
      tableOfContents.push(val);
    }
    for (let child of node.children || []) {
      process(child);
    }
  }

  return node => {
    process(node);
  };
};

There might be a neater way to extract data that doesn't involve making a fake plugin? Also you can get all heading nodes in a simpler way with another unified plugin that I forgot (my specific use-case was slightly more complicated and required information about other nodes as well, which is why the implementation above is recursive).
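To try the heading traversal without wiring up a plugin, it can be written as a plain function and run on a hand-built mdast tree. This is only a sketch: `toText` and `slugify` are simplified stand-ins (assumptions) for mdast-util-to-string and github-slugger, and `collectHeadings` is a hypothetical name, so slugs may differ from the real ones for headings with punctuation or duplicates.

```javascript
// Simplified stand-ins (assumptions) for mdast-util-to-string and github-slugger.
const toText = node =>
  'value' in node ? node.value : (node.children || []).map(toText).join('');
const slugify = text =>
  text.toLowerCase().trim().replace(/[^\w\s-]/g, '').replace(/\s+/g, '-');

// Walk the tree recursively and collect one entry per heading node.
function collectHeadings(tree) {
  const tableOfContents = [];
  (function walk(node) {
    if (node.type === 'heading') {
      const value = toText(node);
      tableOfContents.push({ depth: node.depth, value, slug: slugify(value) });
    }
    for (const child of node.children || []) walk(child);
  })(tree);
  return tableOfContents;
}

// Hand-built mdast tree for "# Getting Started", a paragraph, and "## Install *xdm*".
const tocTree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Getting Started' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Some text.' }] },
    {
      type: 'heading',
      depth: 2,
      children: [
        { type: 'text', value: 'Install ' },
        { type: 'emphasis', children: [{ type: 'text', value: 'xdm' }] },
      ],
    },
  ],
};

const toc = collectHeadings(tocTree);
console.log(toc);
```

Nested inline nodes (like the emphasis above) are flattened into the heading text, which is the behavior the recursive plugin relies on.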

To get gatsby-remark-images working:

const interopDefault = exp =>
  exp && typeof exp === `object` && `default` in exp ? exp[`default`] : exp;

const getPlugin = ({
  xdmNode,
  getNode,
  getNodesByType,
  reporter,
  cache,
  pathPrefix,
  ...helpers
}) => {
  async function transformer(markdownAST) {
    const requiredPlugin = interopDefault(require('./custom-gatsby-img.js'));

    await requiredPlugin(
      {
        markdownAST,
        markdownNode: xdmNode,
        getNode,
        getNodesByType,
        get files() {
          return getNodesByType(`File`);
        },
        pathPrefix,
        reporter,
        cache,
        ...helpers,
      },
      {
        maxWidth: 832,
        quality: 100,
        disableBgImageOnAlpha: true,
      }
    );

    return markdownAST;
  }
  return [() => transformer, {}];
};

module.exports = stuff => getPlugin(stuff);

The second object passed into requiredPlugin contains the options for gatsby-remark-images. I think you can use this technique for other gatsby-remark-* plugins as well, though most of the other plugins have functionality that can be achieved by another xdm-compatible remark plugin.
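For anyone wondering what the interopDefault helper in the wrapper is for: it papers over the difference between a module that exports the plugin function directly and a transpiled ES module that puts it under `default`. A small sketch, where `cjsPlugin` and `esmPlugin` are made-up stand-ins for the two module shapes:

```javascript
const interopDefault = exp =>
  exp && typeof exp === 'object' && 'default' in exp ? exp.default : exp;

// CommonJS style: the function itself is the module's export.
const cjsPlugin = () => 'cjs';
// Transpiled ES module style: the function sits under `default`.
const esmPlugin = { __esModule: true, default: () => 'esm' };

// Either shape ends up as a callable function.
const results = [interopDefault(cjsPlugin)(), interopDefault(esmPlugin)()];
console.log(results);
```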

There's still a small problem -- gatsby-remark-images generates type: html nodes (see #41). To get around this, either modify the gatsby-remark-images plugin or just create another remark plugin to convert type: html nodes to a JSX custom component that just renders HTML. Below is the implementation of remarkHtmlNodes:

module.exports = () => {
  function process(node) {
    if (node.type === 'html') {
      node.type = 'mdxJsxTextElement';
      node.name = 'RAWHTML';
      node.children = [
        {
          type: 'text',
          value: node.value,
        },
      ];
    }
    for (let child of node.children || []) {
      process(child);
    }
  }

  return node => {
    process(node);
  };
};
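To see what this conversion does, the same logic can be run on a small hand-built tree. This is a sketch only: `convertHtmlNodes` is a hypothetical standalone copy of the plugin's inner function, and the HTML string is a made-up example of what an image plugin might emit.

```javascript
// Standalone copy of the plugin's traversal, for experimentation.
function convertHtmlNodes(node) {
  if (node.type === 'html') {
    // Turn the raw HTML node into a <RAWHTML> JSX element whose only
    // child is the original HTML string.
    node.type = 'mdxJsxTextElement';
    node.name = 'RAWHTML';
    node.children = [{ type: 'text', value: node.value }];
  }
  for (const child of node.children || []) convertHtmlNodes(child);
  return node;
}

const htmlTree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Before' }] },
    { type: 'html', value: '<span class="example-image-wrapper"></span>' },
  ],
};

convertHtmlNodes(htmlTree);
const converted = htmlTree.children[1];
console.log(converted.type, converted.name, converted.children[0].value);
```

The HTML string ends up as the `children` prop of the RAWHTML component, which is why that component can pass it straight to dangerouslySetInnerHTML.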

(The remarkHtmlNodes plugin above could be implemented more cleanly with a proper tree-traversal library.) Also, make sure that any image assets you reference in your markdown files are loaded by gatsby-source-filesystem before the markdown files themselves. So, in gatsby-config.ts, for plugins:

  {
    resolve: `gatsby-source-filesystem`,
    options: {
      path: `${__dirname}/src/assets`,
      name: `assets`,
    },
  },
  {
    resolve: `gatsby-source-filesystem`,
    options: {
      path: `${__dirname}/content`,
      name: `content`,
    },
  },

This will work, since assets (the images) are loaded before the markdown. However, flipping the order of the two will cause images to fail silently (but it will sometimes work during development, which causes major debugging headaches...)

To render the markdown returned from xdmNode.body:

import * as React from 'react';
import {
  Fragment as _Fragment,
  jsx as _jsx,
  jsxs as _jsxs,
} from 'react/jsx-runtime';
import { components } from './MDXComponents';

const Markdown = (props: { body: any }) => {
  const fn = new Function(props.body)();

  return (
    <div className="markdown">{fn(_Fragment, _jsx, _jsxs, { components })}</div>
  );
};

export default React.memo(Markdown);

(I think there might be a better way to do this? see src/evaluate.js and src/run.js in the xdm repo)
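To make the round trip concrete, here is a sketch of how the string rewrites from createXdmNode and the `new Function` call fit together. The `compiled` string is a hand-written, much shortened stand-in for real xdm output (an assumption, not actual compiler output), and `jsx` is a stub in place of react/jsx-runtime so the snippet runs without React.

```javascript
// Hypothetical, shortened stand-in for the string xdm.compile() produces.
const compiled = [
  'import {Fragment as _Fragment, jsx as _jsx, jsxs as _jsxs} from "react/jsx-runtime";',
  'export const title = "Hello";',
  'function MDXContent(_props) {',
  '  return _jsx("h1", {children: title});',
  '}',
  'export default MDXContent;',
].join('\n');

// Rewrites mirroring createXdmNode: drop the runtime import, pass the runtime
// in as arguments, and turn the module into a function body that returns
// MDXContent.
const body = compiled
  .replace(/import .* from "react\/jsx-runtime";/, '')
  .replace(
    'function MDXContent(_props) {',
    'function MDXContent(_Fragment, _jsx, _jsxs, _props) {'
  )
  .replace('export default MDXContent', 'return MDXContent')
  .replace(/export const /g, 'const ');

// Stub runtime (assumption): records what would be rendered.
const Fragment = Symbol('Fragment');
const jsx = (type, props) => ({ type, props });

// Same call shape as in the Markdown component.
const MDXContent = new Function(body)();
const element = MDXContent(Fragment, jsx, jsx, {});
console.log(element);
```

Because the approach depends on the exact text the compiler emits, any change in xdm's output format would break the replacements, which is one reason to look at the evaluate/run code in the xdm repo instead.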

In your MDX components, make sure to also include the RAWHTML component:

const RAWHTML = ({ children }) => {
  return <div dangerouslySetInnerHTML={{ __html: children }} />;
};

You might also need to create schema definitions for Xdm nodes:

exports.createSchemaCustomization = ({ actions }) => {
  const { createTypes } = actions;
  const typeDefs = `
    type Xdm implements Node {
      body: String
      fileAbsolutePath: String
      frontmatter: XdmFrontmatter
      isIncomplete: Boolean
      toc: TableOfContents
    }

    type XdmFrontmatter implements Node {
      id: String
      title: String
      author: String
      description: String
      prerequisites: [String]
      redirects: [String]
    }
  `;
  createTypes(typeDefs);
};

(some of these are specific to my project, adjust as needed)


gatsby-plugin-mdx also came with a loader that let you import .mdx files. You can achieve the same result by adding a custom webpack loader in gatsby-node.js; however, the provided xdm/webpack.cjs loader doesn't work due to ESM/CJS conflicts. I got around this by creating a custom webpack-xdm.js file that imports our custom xdm bundle:

const { getOptions } = require('loader-utils');
const { xdm } = require('./xdm');

module.exports = function (code) {
  const callback = this.async();
  xdm
    .compile(
      { contents: code, path: this.resourcePath },
      {
        remarkPlugins: [],
        rehypePlugins: [],
        ...getOptions(this),
      }
    )
    .then(file => {
      callback(null, file.contents, file.map);
      return file;
    }, callback);
};

Then, in gatsby-node.js:

const path = require('path');

exports.onCreateWebpackConfig = ({ actions, stage, loaders, plugins }) => {
  actions.setWebpackConfig({
    module: {
      rules: [
        {
          test: /\.mdx$/,
          use: [
            loaders.js(),
            {
              loader: path.resolve(__dirname, 'src/gatsby/webpack-xdm.js'),
              options: {},
            },
          ],
        },
      ],
    },
  });
};

Note that this loader doesn't let you use Gatsby's image processing. I believe (but haven't tried) that you can get the image processing working by creating a wrapper around gatsby-remark-images similar to what we did in onCreateNode.


Again, this was mostly a proof-of-concept so I didn't bother to make the code neat/maintainable. Hopefully somebody will come up with a better solution to this soon :pray:

Useful links in case someone else wants to attempt this:

It's also possible to create a browser "playground" with xdm (and the performance is surprisingly good). See: https://github.com/cpinitiative/usaco-guide/blob/c885f4c1ec19c78a0ff18c5b1b474d1ad218ce7b/src/components/DynamicMarkdownRenderer.tsx

I don't have hard benchmarks, but my build time nearly (?) halved (in Gatsby v3 and webpack 5 at least) after implementing these changes. Playground render performance improved by ~66%. I think I'm mostly bottlenecked by KaTeX at this point (previously, the Babel transforms from MDX were the primary bottleneck for me).

wooorm commented 3 years ago
thecodingwizard commented 3 years ago

Thanks for the suggestions!

For remark-mdx-frontmatter: is there a neat way to extract just the frontmatter of an MDX file efficiently (ie. without having to compile the entire file)? An extension of this would be to extract the frontmatter + any exported values of the MDX files efficiently.

The use case for this is because optimally, during development, each MDX file would be compiled on-demand rather than compiling every MDX file when the development server starts, since compiling many MDX files can take a while (especially with extensive latex). However, the frontmatter + exported values of every MDX file would be extracted (ideally efficiently) when the development server starts, since this information is needed to generate page information.

Frontmatter can be extracted with graymatter. I haven't figured out how to efficiently extract exported values though.

If there isn't a neat way to handle this, it's not a problem -- xdm is fast enough that this optimization isn't that important, and Gatsby caches nodes already anyway, so the performance difference is negligible after the first run. I'm mostly just curious to see if this is possible :P

wooorm commented 3 years ago

For remark-mdx-frontmatter: is there [1] a neat way to extract just the frontmatter of an MDX file efficiently (ie. without having to compile the entire file)? [2] An extension of this would be to extract the frontmatter + any exported values of the MDX files efficiently.

[1] that’s what frontmatter is: it’s static, you don’t need to know if the file is MDX, or markdown, or something entirely different. The frontmatter can be accessed without compiling the file. And graymatter (or vfile-matter) can do that.

[2] is done by remark-mdx-frontmatter: it turns frontmatter into exports, which, similar to all the other exports, can then be accessed. For the “efficiently” part though: MDX is a language that compiles to JavaScript (so make sure to compile less). Once you have the JavaScript, the JS engine should be smart enough to only evaluate export const title = 'whatever' if you’re importing import {title} from './content.mdx'.
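To illustrate point [1], that frontmatter can be read without touching the MDX compiler at all, here is a deliberately tiny sketch. `readFrontmatter` is a hypothetical stand-in for gray-matter that only understands flat `key: value` lines; a real project would use gray-matter or vfile-matter, which parse full YAML.

```javascript
// Minimal stand-in (assumption) for gray-matter: flat `key: value` pairs only.
function readFrontmatter(source) {
  // Frontmatter is the block between two `---` lines at the very top.
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(source);
  if (!match) return { data: {}, content: source };

  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(':');
    if (sep === -1) continue; // skip lines that are not `key: value`
    data[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  // Everything after the closing `---` is the MDX body, left uncompiled.
  return { data, content: source.slice(match[0].length) };
}

const mdx = '---\ntitle: Intro to DP\nauthor: Jane\n---\n\n# Hello\n';
const { data, content } = readFrontmatter(mdx);
console.log(data, JSON.stringify(content));
```

Since this is plain string handling, it can run over every file at dev-server startup while the expensive compile step is deferred until a page is actually requested.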

kimbaudi commented 3 years ago

I'm facing a similar issue with the latest unist-util-visit v3.0.0, which is ESM-only. I tried to figure out how to use it in Gatsby (using esm, adding "type": "module"), but obviously it does not work. I don't think it is possible to use ESM-only packages in Gatsby at the moment. Gatsby currently doesn't support ESM.

wooorm commented 3 years ago

@kimbaudi How to use ESM is outside the scope of this project. The comments here show a way to make it work. Did you try them?

kimbaudi commented 3 years ago

@wooorm I tried compiling unist-util-visit with webpack and using the bundle, but I couldn't get it to work. I'll probably have to try again to be sure. I also tried dynamic imports, but I haven't figured out how to get it working with Gatsby since dynamic imports are asynchronous and Gatsby is using require which is synchronous.

module.exports = async () => {
  const { visit } = await import('unist-util-visit')
}

There is probably a way to get it working and I just haven't figured it out. Thanks for all your work.
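One possible way around the sync/async mismatch, sketched under assumptions: gatsby-remark-style plugins may themselves be async (the gatsby-remark-images wrapper earlier in this thread is awaited), so the dynamic import can be awaited inside the plugin and its promise cached. `lazyImport` and `myRemarkPlugin` are hypothetical names, and this assumes the file is not transpiled in a way that rewrites `import()` into `require()`.

```javascript
// Returns a loader that imports `specifier` on first call and reuses the
// same promise on every later call.
function lazyImport(specifier) {
  let promise;
  return () => (promise = promise || import(specifier));
}

// Nothing is imported yet; the package is only resolved when loadVisit() runs.
const loadVisit = lazyImport('unist-util-visit');

// An async plugin in the gatsby-remark-* shape: awaiting the import is fine
// here because the caller awaits the plugin.
async function myRemarkPlugin({ markdownAST }) {
  const { visit } = await loadVisit();
  visit(markdownAST, 'heading', node => {
    node.data = { ...node.data, visited: true };
  });
  return markdownAST;
}

module.exports = myRemarkPlugin;
```

The top-level `require` of this file stays synchronous; only the work inside the plugin waits on the ESM package.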

wooorm commented 3 years ago

I don’t quite understand what unist-util-visit has to do with xdm?

And, in the thread above, there are references to a project that has what you want working. So the solution you’re looking for is linked above?

kimbaudi commented 3 years ago

both xdm and unist-util-visit are ESM only and I was trying to get unist-util-visit to work with Gatsby as @thecodingwizard was trying to get xdm to work with Gatsby.

I understand that how to use ESM is outside the scope of this project. I was just commenting that I am facing a similar issue with unist-util-visit.

wooorm commented 3 years ago

ahh, okay! I was assuming this was about xdm 😅