executablebooks / MyST-Parser

An extended commonmark compliant parser, with bridges to docutils/sphinx
https://myst-parser.readthedocs.io
MIT License
738 stars 196 forks

mdx interop #266

Open mgielda opened 3 years ago

mgielda commented 3 years ago

First up, thank you for an awesome framework! I have been looking for this kind of thing for so long, being a huge Sphinx fan but also liking the simplicity of Markdown.

Is your feature request related to a problem? Please describe.

The fragmentation of the markup languages ecosystem is indeed frustrating. For years I was torn between md and rst, leaning towards rst because of superior features.

And then MyST came along. However, it did so more or less when mdx also became huge.

I am torn between jumping on the MyST bandwagon and doing something based on mdx, which has also seen some incredible progress with Docusaurus etc. I am currently using mdx since the MyST JS tooling did not work for me as expected (I mean the HTML renderer, not the JS parser). I think there is a solution to this problem that would let us have our cake and eat it too.

Describe the solution you'd like

It would be splendid if MyST came with a flavor that just used HTML tags for roles/directives. Simply speaking, inline HTML (ends in the same line) could be treated as a role, while a separate block (multiple lines before the closing tag) would be a directive. Attributes are pretty obvious for directives, just use HTML attributes (dunno how to handle attributes on roles, which do not exist in RST, but I guess you could just ignore them, or somehow make them work later).
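
To make the proposed mapping concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not something MyST-Parser or any MDX tool provides: the function names are hypothetical, the regexes handle only simple double-quoted attributes, and lowercasing the tag name to get the role/directive name is an assumption.

```python
import re

# Hypothetical helpers for illustration; not part of MyST-Parser or MDX.
FENCE = "`" * 3  # a MyST directive is written as a fenced block

# name="value" pairs (double-quoted values only, an assumption of this sketch)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

# <Tag attr="x">text</Tag> opened and closed on the same line -> role
INLINE_RE = re.compile(r'<(\w+)((?:\s+\w+="[^"]*")*)\s*>([^<\n]*)</\1>')

# <Tag attr="x"> on its own line, body, </Tag> on its own line -> directive
BLOCK_RE = re.compile(
    r'^<(\w+)((?:\s+\w+="[^"]*")*)\s*>\n(.*?)\n</\1>$',
    re.DOTALL | re.MULTILINE,
)


def block_to_directive(match):
    """Turn a multi-line tag into a fenced directive with :key: value options."""
    name, raw_attrs, body = match.group(1), match.group(2), match.group(3)
    options = [f":{key}: {value}" for key, value in ATTR_RE.findall(raw_attrs)]
    lines = [FENCE + "{" + name.lower() + "}", *options]
    if options:
        lines.append("")  # blank line between the options and the content
    lines += [body, FENCE]
    return "\n".join(lines)


def inline_to_role(match):
    """Turn a same-line tag into a role; attributes are ignored, as suggested."""
    return "{" + match.group(1).lower() + "}`" + match.group(3) + "`"


def mdx_like_to_myst(text):
    """Rewrite block tags first, then any remaining inline tags."""
    text = BLOCK_RE.sub(block_to_directive, text)
    return INLINE_RE.sub(inline_to_role, text)


if __name__ == "__main__":
    sample = 'See <Abbr>MyST</Abbr> here.\n\n<Note class="tip">\nBlock body.\n</Note>\n'
    print(mdx_like_to_myst(sample))
```

A text-to-text rewrite like this would not cope with nested tags or JSX expressions; it is only meant to show which part of a tag would become the name, the options, and the content.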

This would be super cool because it would essentially be fully compatible (perhaps bar some niche things) with the MDX syntax and - if you just implement the relevant MDX classes in some kind of JS parser - they would just work (TM).

So, throwing the idea out there and curious about feedback.

I don't want to start my own yet-another markup ecosystem ;-) MyST is great since it comes with Sphinx infrastructure etc. Having a way to input MDX-like syntax would be pretty incredible, because it could lure a lot of people to use MyST. Also it might save time in developing a JS parser at all, since MDX parsers exist?

Describe alternatives you've considered

As above. I have considered using MDX, but having interoperability with Sphinx is a must for me.

chrisjsewell commented 3 years ago

Thanks. Yeh I'd never heard of mdx, until now, so I'll check it out 😀

mgielda commented 3 years ago

Wow, really? It's popular, supported in Gatsby, Docusaurus... generally well integrated into the JS ecosystem, which is small wonder given that it stems from React/jsx.

But until some time ago I had not heard of MyST, so it's the pot calling the kettle black ;-)

choldgraf commented 3 years ago

That does look interesting! I hadn't heard of it either, so thanks for sharing. I think that the web dev community and the scientific / Python community are pretty different, so it's not surprising. The one thing that gives me pause is that we are intentionally not including any web-specific syntax in MyST (as MyST can be built for a variety of output types), so I think if this were supported it would be via optional syntax extensions, similar to how admonition syntax can be triggered with :::. I think the first step to making that happen would be to prototype rules for markdown-it-py that would select on the proper mdx syntax and spit out the resulting roles/directives. Is that something you are interested in trying @mgielda?

mgielda commented 3 years ago

Thanks for the quick feedback @choldgraf - makes sense indeed. Time and time again I have seen innovation coming from the front end community which looks really pretty (generally speaking web components / React / interactive development stuff) but is not focused on complex data-oriented scenarios or complex structured data that is not HTML, and great innovation in the Jupyter/Sphinx/Python etc. community which in turn does not plug in very well into more web-oriented frameworks. I have always wanted to bring the two closer together somehow, and documentation is a great connection point.

Summing up, yes, I am absolutely interested in building something like this. Of course there might be holes I am not seeing currently but I think it's worth looking into.

For context, we are a reasonably sized team working with a bunch of clients using RST/Sphinx for documenting lots of pretty varied stuff (hardware/software/FPGA tools etc). On the other hand nobody at my company is working on strictly that part of the infrastructure at the moment, it's mostly my personal quest, so I would now need to find someone to go and implement this. I am too involved in the day to day operations to be trusted to implement this myself ;) although I am very good at breaking things and coming up with wacky ideas.

Having said that, if you point me to where in markdown-it-py those rules should go and what other implementations I could rip stuff off, I might be able to create something that looks horrible but works and then get someone with more Python skills than me to review it and do a pull request / package.

choldgraf commented 3 years ago

I think there are two things that might help figure out how this works:

  1. Looking at the addition of the "admonition syntax" (:::)

     Here's the PR that adds it to myst-parser: https://github.com/executablebooks/MyST-Parser/pull/201/

     In particular I think it's enabled in markdown-it-py here: https://github.com/executablebooks/MyST-Parser/pull/201/files#diff-d4dffeb253a1611ee66838672d2f17124ed7f57c4a49d13f008e706e5ae01d51R54

     and I believe this is the code in markdown-it-py that defines the "container plugin" (which is what the "admonition syntax" uses): https://github.com/executablebooks/markdown-it-py/blob/31f426bd9649a4f9ad4753e2aa2060d33de5bd9f/markdown_it/extensions/container/index.py#L10

  2. Looking at how directives are used in MyST-Parser

     Directives are docutils-specific things, so we piggy-back on the markdown-it-py code fence block to find them, and output them in the docutils renderer. Here's where that happens: https://github.com/executablebooks/MyST-Parser/blob/master/myst_parser/docutils_renderer.py#L384

     And here's the render_directive method that actually runs the block as a docutils directive: https://github.com/executablebooks/MyST-Parser/blob/master/myst_parser/docutils_renderer.py#L831

So to support MDX syntax, you'd need to:

  1. Add block-level syntax in markdown-it-py for the MDX syntax.
  2. Add logic that turns MDX blocks into directive calls (I'm not sure how exactly this would be done in myst-parser, maybe @chrisjsewell could advise there...or maybe this is self-contained enough that it should be its own Sphinx package)

mgielda commented 3 years ago

Thanks @choldgraf - this is super useful. Now I am actually tempted to do this ;) It seems like I can hook into render html, and if the tag name is capitalized, I make it a directive. I would need to tweak the function a bit to parse the code block differently but hopefully it's not super hard.
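
As a rough illustration of the capitalized-tag check, here is a small standalone sketch. The function name is hypothetical and nothing here is wired into MyST-Parser's renderer. It uses a regex rather than Python's `html.parser`, because `HTMLParser` passes tag names to its handlers already lowercased, which would lose the distinction.

```python
import re

# Captures the name of an opening tag at the start of an HTML snippet.
# Closing tags (</Note>) and plain text do not match.
OPEN_TAG_RE = re.compile(r"\s*<([A-Za-z][\w.]*)")


def looks_like_component(html_snippet):
    """Return True if the snippet opens with a capitalized tag name.

    Hypothetical helper: capitalized tags (<Note>) would be candidates for
    directives, while lowercase tags (<div>) would stay ordinary HTML.
    """
    match = OPEN_TAG_RE.match(html_snippet)
    return bool(match) and match.group(1)[0].isupper()


if __name__ == "__main__":
    for snippet in ("<Note>\nhi\n</Note>", "<div>hi</div>", "plain text"):
        print(repr(snippet), looks_like_component(snippet))
```

This follows the JSX convention that capitalized names are components and lowercase names are built-in HTML elements, so existing raw HTML in a document would pass through untouched.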

Then we need a way to parse roles, but I would do it as step 2.

choldgraf commented 3 years ago

> Then we need a way to parse roles, but I would do it as step 2.

I think this is relevant to that question: https://github.com/executablebooks/MyST-Parser/issues/69

Would love thoughts or opinions!