DavidAnson / markdownlint

A Node.js style checker and lint tool for Markdown/CommonMark files.
MIT License

Support for MDX? #985

Closed JustinBeckwith closed 1 year ago

JustinBeckwith commented 1 year ago

I don't see this explicitly mentioned anywhere - does markdownlint support MDX?

nschonni commented 1 year ago

https://github.com/DavidAnson/markdownlint/issues/723#issuecomment-1434017183

DavidAnson commented 1 year ago

Hi, @JustinBeckwith! markdownlint supports the CommonMark specification and, more recently, parts of GFM.

JustinBeckwith commented 1 year ago

Are you open to the idea of supporting it? We've started playing with MDX, and it looks like we're heading that direction with our docs:
https://github.com/discord/discord-api-docs/

I'm finding a lot of modern doc generation platforms like https://docusaurus.io are starting to pick it up too. I may be down to help with implementation if the idea isn't offensive :)

DavidAnson commented 1 year ago

It’s unfortunately a little complicated. :) At the core is the parser. Originally, this project used markdown-it. For a few months now, I’ve been transitioning rules over to micromark because it has better positional information that allows better detail and fixes. The good news is that micromark seems to have MDX support (via extensions the same as GFM, https://github.com/micromark/micromark#list-of-extensions). However, some rules can’t use the parser because they try to detect syntax issues that, by definition, don’t parse correctly. Depending on how weird MDX is, existing rules may “just work” with MDX or may need a lot of updating. I am on my phone with bad connectivity, so can’t look at the specification right now (https://mdxjs.com/). But I will look into this soon. Give me a week, please. In the meantime, if you can identify a handful of typical documents to play with, that would be helpful. It would also be interesting to understand if you can lint the output (of MDX compilation) instead of the input because that works today and will catch a variety of issues that could be introduced during compilation.
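To illustrate the point about rules that cannot rely on the parser: a line such as `#Heading` (no space after the hash) is parsed by CommonMark as an ordinary paragraph, so a rule that wants to flag it has to look at the raw text. The following is a minimal sketch of that idea, not markdownlint's actual implementation. The function name `findMissingSpaceAtx` and the simplified fence handling are assumptions made for the example.

```javascript
// Hypothetical helper, NOT markdownlint's real rule code.
// Flags lines like "#Heading" that a CommonMark parser treats as plain text,
// which is why a purely parser-based rule would never see them as headings.
function findMissingSpaceAtx(markdown) {
  const issues = [];
  let inFence = false;
  markdown.split(/\r?\n/).forEach((line, index) => {
    // Simplified fence tracking so "#include" inside a code block is ignored.
    if (/^\s{0,3}(```|~~~)/.test(line)) {
      inFence = !inFence;
      return;
    }
    // One to six "#" followed directly by a character that is
    // neither whitespace nor another "#".
    if (!inFence && /^\s{0,3}#{1,6}[^#\s]/.test(line)) {
      issues.push({ lineNumber: index + 1, line });
    }
  });
  return issues;
}

const sample = [
  "# Title",
  "",
  "#Broken heading",
  "",
  "```c",
  "#include <stdio.h>",
  "```"
].join("\n");

// Reports only line 3; the valid heading and the fenced code are skipped.
console.log(findMissingSpaceAtx(sample));
```

A text-level check like this keeps working regardless of which parser or extensions are in use, but it also has to re-implement details (such as code fences) that a parser would otherwise handle.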

DavidAnson commented 1 year ago

I read a little bit about MDX this evening, and the situation is worse than I realized. :) For one, MDX is not a superset of Markdown in the same way that GFM is. This means you can't do a single parse and have the output be usable for both Markdown and MDX scenarios. For another, MDX does not output Markdown like I expected. This means it is not practical to lint the output instead of the input. Another thing is I don't love the idea of running uncontrolled user code as part of the parse. This leads to unpredictable runtime behavior and possible hangs.

I'm still learning, but I'm really not sure how I feel about this... I understand the intent behind MDX, but the more a document uses it, the more it turns into React.js code, and the less meaningful it feels to try to lint it like a text document. For example, what does it even mean if an MDX component outputs something goofy like a heading that skips levels (H1->H6)? You won't even know that from looking at the document - only if you render it all the way to the final HTML product and then analyze THAT as HTML... Which is pretty far out of scope for this tool.
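The heading example can be made concrete with a rough sketch of a source-level heading-increment check. This is an illustration under stated assumptions, not markdownlint's rule code: the function name `findSkippedHeadingLevels` is hypothetical, only ATX (`#`) headings are considered, and `<Sidebar />` stands in for an imagined MDX component that renders an H6 at runtime.

```javascript
// Hypothetical helper, NOT markdownlint's real rule code.
// Reports ATX headings that jump more than one level deeper than the previous one.
function findSkippedHeadingLevels(markdown) {
  const issues = [];
  let previousLevel = 0;
  let inFence = false;
  markdown.split(/\r?\n/).forEach((line, index) => {
    // Simplified fence tracking so "# comment" inside code is not a heading.
    if (/^\s{0,3}(```|~~~)/.test(line)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;
    const match = /^\s{0,3}(#{1,6})\s/.exec(line);
    if (!match) return;
    const level = match[1].length;
    if (previousLevel > 0 && level > previousLevel + 1) {
      issues.push({ lineNumber: index + 1, from: previousLevel, to: level });
    }
    previousLevel = level;
  });
  return issues;
}

// A skip written directly in the document is visible to the check.
const literalSkip = ["# Title", "", "###### Fine print"].join("\n");
console.log(findSkippedHeadingLevels(literalSkip));

// Here the (imagined) component would render an H6, but the source text
// only contains H1 followed by H2, so nothing is reported.
const componentSkip = ["# Title", "", "<Sidebar />", "", "## Details"].join("\n");
console.log(findSkippedHeadingLevels(componentSkip));
```

The first document yields one issue (level 1 to level 6 on line 3), while the second yields none, because whatever the component emits exists only in the rendered output, which a linter working on the source never sees.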