micromark / micromark-extension-mdx-expression

micromark extension to support MDX or MDX JS expressions
https://unifiedjs.com
MIT License

Could not parse expression with acorn: Unexpected content after expression #9

Closed stevensacks closed 1 year ago

stevensacks commented 1 year ago

Affected packages and versions

Latest

Link to runnable example

No response

Steps to reproduce

npm 9.6.6, node 18.15.0, bundler: N/A (running a script in node locally)

I read the link about this error, but it doesn't seem to apply. I think it's getting confused thinking it's JavaScript curly braces and not just curly braces used in regular text.

https://mdxjs.com/docs/troubleshooting-mdx/#could-not-parse-expression-with-acorn-unexpected-content-after-expression

I have isolated the issue to the following text inside of an MDX file (this is from a documentation file I have no control over).

- The issue is assigned to {no one/a team/a member}.

I tried replacing the slashes with commas and got the same error:

- The issue is assigned to {no one, a team, a member}.

Any idea why this might be happening?

Expected behavior

Shouldn't crash on this string

Actual behavior

Crashes

Runtime

Other (please specify in steps to reproduce)

Package manager

Other (please specify in steps to reproduce)

OS

macOS

Build and bundle tools

Other (please specify in steps to reproduce)

wooorm commented 1 year ago

Hi! This is expected, as the docs you linked point out. Those docs also explain what to do: https://mdxjs.com/docs/troubleshooting-mdx/#could-not-parse-expression-with-acorn-unexpected-content-after-expression

wooorm commented 1 year ago

> I think it's getting confused thinking it's JavaScript curly braces and not just curly braces used in regular text.

It’s not confused, it’s sure. In MDX, all unescaped curly braces are expressions. Escape them if you mean text.
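As a rough illustration of "escape them if you mean text", here is a minimal sketch of a pre-processing helper. The name `escapeOpeningBraces` is hypothetical and not part of micromark or MDX. It assumes the content contains no intentional MDX expressions, and it escapes only the opening brace, matching the `\{#31-alert-notifications}` form suggested elsewhere in this thread.

```typescript
// Hypothetical helper, not part of micromark or MDX.
// Escapes each `{` that is not already preceded by a backslash, so that
// MDX 2 reads it as literal text rather than the start of an expression.
const escapeOpeningBraces = (text: string): string =>
    text.replace(/\\?\{/g, (match: string) =>
        // A two-character match means the brace was already escaped.
        match.charAt(0) === '\\' ? match : '\\{'
    );

const line = '- The issue is assigned to {no one/a team/a member}.';
console.log(escapeOpeningBraces(line));
// - The issue is assigned to \{no one/a team/a member}.
```

Note that a blanket replacement like this would also escape real expressions, and would add visible backslashes inside code spans and fenced code, so it only suits content where every brace is meant as text.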

stevensacks commented 1 year ago

@wooorm Thanks for your quick response.

To be specific, I am trying to process the Sentry documentation mdx files. Around 30 (out of 300ish) files throw errors.

https://github.com/getsentry/sentry-docs/tree/master/src/docs/product

Here are a few of the errors:

Page 'docs/product/accounts/getting-started/index' or one/multiple of its page sections failed.
[42:26: Could not parse expression with acorn: Unexpected character '#'] {
  reason: "Could not parse expression with acorn: Unexpected character '#'",
  line: 42,
  column: 26,
  position: {
    start: { line: 42, column: 26, offset: 3138 },
    end: { line: null, column: null }
  },
  source: 'micromark-extension-mdx-expression',
  ruleId: 'acorn'
}

Page 'docs/product/accounts/pricing' or one/multiple of its page sections failed
[36:2: Unexpected character `!` (U+0021) before name, expected a character that can start a name, such as a letter, `$`, or `_` (note: to create a comment in MDX, use `{/* text */}`)] {
  reason: 'Unexpected character `!` (U+0021) before name, expected a character that can start a name, such as a letter, `$`, or `_` (note: to create a comment in MDX, use `{/* text */}`)',
  line: 36,
  column: 2,
  position: {
    start: { line: 36, column: 2, offset: 2473, _index: 15, _bufferIndex: 1 },
    end: { line: null, column: null }
  },
  source: 'micromark-extension-mdx-jsx',
  ruleId: 'unexpected-character'
}

Page 'docs/product/accounts/quotas/index' or one/multiple of its page sections failed
[26:2: Unexpected character `!` (U+0021) before name, expected a character that can start a name, such as a letter, `$`, or `_` (note: to create a comment in MDX, use `{/* text */}`)] {
  reason: 'Unexpected character `!` (U+0021) before name, expected a character that can start a name, such as a letter, `$`, or `_` (note: to create a comment in MDX, use `{/* text */}`)',
  line: 26,
  column: 2,
  position: {
    start: { line: 26, column: 2, offset: 2276, _index: 12, _bufferIndex: 1 },
    end: { line: null, column: null }
  },
  source: 'micromark-extension-mdx-jsx',
  ruleId: 'unexpected-character'
}

Page 'docs/product/accounts/quotas/manage-attachments-quota' or one/multiple of its page sections failed
[19:36: Could not parse expression with acorn: Unexpected character '#'] {
  reason: "Could not parse expression with acorn: Unexpected character '#'",
  line: 19,
  column: 36,
  position: {
    start: { line: 19, column: 36, offset: 1770 },
    end: { line: null, column: null }
  },
  source: 'micromark-extension-mdx-expression',
  ruleId: 'acorn'
}
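With around 30 failing files, it can help to group the failures by `source` and `ruleId` before deciding what to fix. The sketch below is only an illustration: the `ParseFailure` interface and `countByRule` function are hypothetical names, and the interface simply mirrors the fields visible in the logged errors above.

```typescript
// Hypothetical shape mirroring the fields shown in the logged errors.
interface ParseFailure {
    page: string;
    reason: string;
    line: number;
    column: number;
    source: string;
    ruleId: string;
}

// Count failures per "source:ruleId" pair so the most common causes stand out.
const countByRule = (failures: ParseFailure[]): Record<string, number> => {
    const counts: Record<string, number> = {};
    for (const failure of failures) {
        const key = failure.source + ':' + failure.ruleId;
        counts[key] = (counts[key] || 0) + 1;
    }
    return counts;
};

// The four failures quoted above, reduced to the fields used here.
const failures: ParseFailure[] = [
    {page: 'docs/product/accounts/getting-started/index', reason: "Unexpected character '#'", line: 42, column: 26, source: 'micromark-extension-mdx-expression', ruleId: 'acorn'},
    {page: 'docs/product/accounts/pricing', reason: 'Unexpected character `!`', line: 36, column: 2, source: 'micromark-extension-mdx-jsx', ruleId: 'unexpected-character'},
    {page: 'docs/product/accounts/quotas/index', reason: 'Unexpected character `!`', line: 26, column: 2, source: 'micromark-extension-mdx-jsx', ruleId: 'unexpected-character'},
    {page: 'docs/product/accounts/quotas/manage-attachments-quota', reason: "Unexpected character '#'", line: 19, column: 36, source: 'micromark-extension-mdx-expression', ruleId: 'acorn'},
];

console.log(countByRule(failures));
```

In this sample the failures fall into two groups: `acorn` errors on `#` inside braces, and `unexpected-character` errors on `!` at column 2, which would be consistent with HTML comments (`<!--`) being read as JSX.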

Am I doing something wrong by passing the content of these mdx files directly in like this?

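// Assumed imports (not shown in the original post)
import {fromMarkdown} from 'mdast-util-from-markdown';
import {mdxFromMarkdown} from 'mdast-util-mdx';
import {mdxjs} from 'micromark-extension-mdxjs';

// fromMarkdown is where the errors occur
const mdxTree = fromMarkdown(content, {
    extensions: [mdxjs()],
    mdastExtensions: [mdxFromMarkdown()],
});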

Sentry's documentation is written using valid MDX.

As an example, here's the line that triggers the error in the first file listed:

### Alert Notifications {#31-alert-notifications}

By default, Sentry will notify you about errors in your apps...

It's that first line. The header with the braces in it. This is valid markdown, but fromMarkdown doesn't like it.

Am I missing a step in the conversion process? Should I be using more/different extensions/mdastExtensions?

Thank you!

wooorm commented 1 year ago

> Sentry's documentation is written using valid MDX.

Presumably MDX 1. See the announcement and migration docs for MDX 2 here: https://mdxjs.com/blog/v2/

> This is valid markdown, but fromMarkdown doesn't like it.

It’s markdown. It’s not MDX, because #31-alert-notifications is not valid JavaScript. You can escape the opening brace: \{#31-alert-notifications}. You can also use valid JavaScript, such as a comment: {/* 31-alert-notifications */}.

stevensacks commented 1 year ago

Assuming I can't change the content itself (Sentry is in charge of their own docs), do you have any suggestions for how I might "massage" the content before I pass it to fromMarkdown()?

Is there a script that can convert mdx1 to mdx2?

stevensacks commented 1 year ago

I figured out a solution that works for Sentry, at least.

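// Rewrite heading IDs such as {#31-alert-notifications} into MDX 2 comments.
const fixBraceClauses = (content: string) =>
    content.replaceAll(/{#([^}]+)}/g, '{/* $1 */}');

// Drop HTML comments (<!-- ... -->) together with the whitespace around them.
const removeMarkdownComments = (content: string) =>
    content.replaceAll(/\s*<!--[\S\s]*?-->\s*/g, '');

// Remove comments first, then rewrite the remaining brace clauses.
const sanitizeContent = (content: string) =>
    fixBraceClauses(removeMarkdownComments(content));

// rawContent is the unmodified text of one .mdx file.
const mdxTree = fromMarkdown(sanitizeContent(rawContent), {
    extensions: [mdxjs()],
    mdastExtensions: [mdxFromMarkdown()],
});

wooorm commented 1 year ago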

Nice that you found something that works for you!

> Is there a script that can convert mdx1 to mdx2?

No. If that were possible, MDX 2 would not be needed: the entire point of the grammar changes in MDX 2 is to allow tools to parse the entirety of MDX and understand what everything is. That wasn’t possible before, but the new grammar does allow for such scripts in the future.
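One caveat with the regex-based sanitizer posted earlier in the thread is that it runs over the whole file, so it would also rewrite `{#...}` and strip `<!-- ... -->` inside fenced code blocks. The sketch below is one possible way around that, not a general MDX 1 to MDX 2 converter. The name `sanitizeOutsideFences` is hypothetical, and the fence detection is a simplifying assumption: any line starting with three backticks or three tildes toggles a fence.

```typescript
// Same two rewrites as the sanitizer posted earlier in the thread.
const fixBraceClauses = (content: string): string =>
    content.replace(/\{#([^}]+)\}/g, '{/* $1 */}');

const removeMarkdownComments = (content: string): string =>
    content.replace(/\s*<!--[\S\s]*?-->\s*/g, '');

// Hypothetical helper: apply the rewrites only to text outside fenced code.
// Assumption: a line starting with three backticks or tildes toggles a fence.
const sanitizeOutsideFences = (content: string): string => {
    const output: string[] = [];
    let prose: string[] = [];
    let inFence = false;

    // Sanitize the prose collected so far and append it to the output.
    const flushProse = (): void => {
        if (prose.length > 0) {
            output.push(fixBraceClauses(removeMarkdownComments(prose.join('\n'))));
            prose = [];
        }
    };

    for (const line of content.split('\n')) {
        const isFenceLine = /^\s*(`{3}|~{3})/.test(line);
        if (isFenceLine && !inFence) {
            flushProse();
            inFence = true;
            output.push(line); // opening fence, kept verbatim
        } else if (inFence) {
            output.push(line); // code inside the fence, kept verbatim
            if (isFenceLine) inFence = false; // closing fence
        } else {
            prose.push(line);
        }
    }
    flushProse();
    return output.join('\n');
};

const sample = '## Setup {#setup}\n~~~\n{#left-alone}\n~~~\n<!-- note -->\nDone';
console.log(sanitizeOutsideFences(sample));
```

This does not handle inline code spans, indented code, or nested fences of different lengths, so it is a starting point rather than a complete solution.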