Closed revolunet closed 3 years ago
If HTML is peeking through, it sounds like https://github.com/micromark/micromark-extension-mdx-md isn‘t used, which turns it off?
For your getTooltipElement
function, I suggest to also build a node: https://github.com/syntax-tree/mdast-util-mdx-jsx#mdxjsxtextelement
fixed the getTooltipElement
thanks.
Looks like it works by adding mdxMd
to fromMarkdown.extensions :
const tree = fromMarkdown(content, {
extensions: [syntax({ acorn: acorn }), mdxMd],
mdastExtensions: [mdxJsx.fromMarkdown]
});
this is wonderful and kind of magic, thank you 💙
so this is indeed a feature and not a bug :)
HTML is a black box to markdown, with MDX, less so. But I thought I’d split the adding JSX vs removing HTML parts, it’s a bit confusing but at this level I thought it wise to split it!
Sounds interesting, what you’re doing with MDX at such a low level btw!
You can also use all of MDX (agnostic to JS) with https://github.com/micromark/micromark-extension-mdx, and gnostic to JS with https://github.com/micromark/micromark-extension-mdxjs!
Thanks for the explanation, it helps.
Our current use case is adding "tooltip" capabilities to various content, based on a glossary, so we just need to "wrap" these texts with some specific container. This will be used client-side to trigger the pretty tooltip.
What’s the reason for going full on JSX, instead of something more “markdown-esque” (not necessarily better, just, different): https://github.com/remarkjs/remark-directive
Well for some reason our users can author MDX content, that mix text and components; we went that way because using components sounded natural in the mdx/react ecosystem, and we didnt even think about markdown directives :)
ahh okay, yeah MDX is good for devs! :+1:
in some cases, we get full HTML as input, can i use the same process as MDX to parse and replace ?
For example, in this HTML, the 2 keyword
in the first paragraph dont get replaced while the next ones are well replaced 🤔
<p><a href="#">The keyword</a>blabla et une keyword de plus.</p>
<p><strong>keyword</strong></p>
<p>another keyword</p>
here's a sandbox with the repro : https://codesandbox.io/s/addglossary-in-mdx-zxyzw?file=/src/index.test.js
Is in HTML, or JSX?
Some extra info that might help here: There is also quite a bit of difference between mdx-js/mdx 1, mdx-js/mdx 2 (latest beta), and mdx-js/mdx (main branch, which is similar to https://github.com/wooorm/xdm). The first treats html/jsx more like how markdown sees it: a “black box”, meaning: it’s one HTML node, not separate tags with content in between. The rest parse it as separate tags with content in between.
Regardless, you might be better off doing this in rehype space, so the HTML (JSX?) is already parsed? And I’d suggest to build a node instead of injecting a string of html (https://codesandbox.io/s/addglossary-in-mdx-zxyzw?file=/src/glossary.js:1092-1181), if that’s possible?
Hi,
I that case this is HTML; so i understand we should better use a different "pipeline" based on input type : MDX or HTML ? I had some hope we could have the same parser but we can differentiate inputs if this makes things easier.
About adding a node instead adding a string of html, i did it but met an unexpected issue. will make a repro for that
The pipelines might be different, but I’m guessing that a lot of plugins can be reused? Including this plugin, if it’s working on hast, it can be used in both places.
If the input is different, having different pipelines seems the best way to go, to me, otherwise you’re going to run into problems with style="..."
being incorrect in React, but style={...}
being incorrect in HTML, <img>
without the closing slash, and similar problems.
And do let me know about that other issue!
I've updated the codesandbox and simplfied the test cases to use a mdxJsxTextElement
as replacement but for some reason terms are sometimes replaced twice 🤔
Looks like a bug!
Here’s a reduces test case:
const findAndReplace = require('mdast-util-find-and-replace')
const toMarkdown = require('mdast-util-to-markdown')
function replace(match) {
return {type: 'strong', children: [{type: 'text', value: match}]}
}
const tree = {type: 'paragraph', children: [{type: 'text', value: 'a b'}]}
findAndReplace(tree, [['never matches'], ['b', replace]])
console.log(toMarkdown(tree))
Which yields:
a ****b****
Note that the doubling only occurs when [b, replace]
is not the first pattern.
And it doesn’t occur if a
is removed
Parsing MDX containing HTML
Hi,
Not sure this is a bug or a feature;
I'm trying to parse MDX which can contain some HTML. At the end i'm looking to use mdast-util-find-and-replace to replace some text terms in the markdown.
While trying to parse some markdown with mdast-util-from-markdown, i find a difference when the markdown starts with HTML tags
When parsing
Some <b>HTML</b>content
the tree looks correct and the<b>
is amdxJsxTextElement
.mdast-util-find-and-replace
can then operate on thetext
element.When parsing
<p>Some HTML content</p>
, i only get a singlehtml
element containing the<p>
tags, and in that casemdast-util-find-and-replace
is unable to replace the textThere is a full repro with failing test case here for illustration : https://codesandbox.io/s/addglossary-in-mdx-zxyzw?file=/src/index.js
Expected behavior
Even starting with html tags, the MDX should be fully parsed
Actual behavior
The markdown is parsed as a single html node