2i2c-org / upstream

A place to keep track of upstream issues that we'd like to work on.

Implement a MyST parser for Nikola #14

Open choldgraf opened 3 years ago

choldgraf commented 3 years ago

Background

To learn a bit more about the MyST parser, it may be a useful and insightful exercise to first try using it outside of the Jupyter Book context. @damianavila and I discussed using Nikola as a test case, since Damian is familiar with Nikola, and also because "blogging functionality" is a highly requested feature in Jupyter Book.

Update (by Damián)

@damianavila explored the space and, in ongoing conversations with @choldgraf, we agreed that we may publish a blog post sharing the insights we gained in this exploration, as a way to push further discussion about fundamental pieces of the MyST ecosystem.

The current outline for the proposed blog post lives here: https://github.com/2i2c-org/external/issues/14#issuecomment-885988780

Steps:

damianavila commented 3 years ago

OK, some written points about the research I have done so far...

  1. There is already some "primitive" (aka basic) support for Myst in Nikola in the form of a plugin
  2. I call it "primitive" support because it is using the Myst Parser Python API to build the HTML output: https://github.com/getnikola/plugins/blob/master/v8/myst/myst.py#L66
  3. The "primitive" support is actually not that useful, as explained in the following issue on the MyST Parser repo. Basically, it renders neither roles nor directives, because you need Sphinx for the full rendering (as outlined by Chris S).
  4. Going deeper, the to_html function uses the default parser which, in turn, uses an HTML renderer from the markdown-it-py project: https://github.com/executablebooks/markdown-it-py/blob/master/markdown_it/renderer.py#L36
  5. It seems it is possible to have a custom HTML renderer on top of the markdown-it-py one, but is this the proper place to add support for roles and directives independent from Sphinx? @choldgraf suggested that a proper place would instead be the MyST Parser itself, maybe with a more powerful HTML renderer (independent from Sphinx and supporting roles and directives somehow)? Building better Sphinx-independent HTML rendering support is one way to make progress for this use case or...
  6. Alternatively... we could leverage Nikola's built-in rst support and use that to build the wanted HTML. Nikola uses a docutils publisher to render HTML from rst-based posts. We should be able to somehow build on/improve the current Nikola myst plugin so that it uses the MyST Parser docutils renderer and tunnels the output into the Nikola docutils publisher. Then Nikola should be able to render roles and directives coming from MyST-based posts, hopefully... unless I am missing something else (i.e., difficulties pushing stuff into the Nikola publisher, among other things).

@choldgraf, thoughts are welcome :wink: (hopefully this makes more sense now that it is written)

choldgraf commented 3 years ago

Nice - thanks for this write-up! My intuition is that the most useful next step would be to improve the to_html story so that we get some slightly better role/directive support, even if it is just something like this as the output:

    <div class="directive <directivename>">
      <p><Directive content></p>
    </div>

at least that way the output could be styled etc. Over time the mapping from specific directives to HTML could be extended somehow (e.g., maybe somebody could define a Directive class and their own mappings, and if the class doesn't exist it falls back to defaults?)
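As a minimal sketch of that idea (not something that exists in MyST Parser or markdown-it-py), the fallback-plus-overrides behaviour could look like this in plain Python. The names `render_directives`, `default_renderer`, and `render_note` are hypothetical, and the regex is an assumption that only handles simple, non-nested backtick-fenced directives.

```python
import html
import re
from typing import Callable, Dict, Optional

FENCE = "`" * 3  # three backticks, built this way so the example nests cleanly

# Matches a simple fenced directive: FENCE{name} optional-args, content lines, FENCE
DIRECTIVE_RE = re.compile(
    "".join([
        r"^", FENCE, r"\{(?P<name>[\w-]+)\}[ \t]*(?P<args>[^\n]*)\n",
        r"(?P<body>.*?)",
        r"^", FENCE, r"[ \t]*$",
    ]),
    re.MULTILINE | re.DOTALL,
)

# A renderer receives (directive name, argument string, raw content) and returns HTML.
Renderer = Callable[[str, str, str], str]


def default_renderer(name: str, args: str, body: str) -> str:
    """Fallback: wrap the escaped content in a generic, stylable div."""
    content = html.escape(body.strip())
    return f'<div class="directive {html.escape(name)}">\n  <p>{content}</p>\n</div>'


def render_note(name: str, args: str, body: str) -> str:
    """Hypothetical user-defined mapping for the `note` directive."""
    content = html.escape(body.strip())
    return f'<aside class="admonition note">\n  <p>{content}</p>\n</aside>'


def render_directives(text: str, renderers: Optional[Dict[str, Renderer]] = None) -> str:
    """Replace each fenced directive with HTML, falling back to the default renderer."""
    mapping = renderers or {}

    def replace(match: "re.Match[str]") -> str:
        name = match.group("name")
        render = mapping.get(name, default_renderer)
        return render(name, match.group("args").strip(), match.group("body"))

    return DIRECTIVE_RE.sub(replace, text)
```

Calling `render_directives(text)` with no mapping wraps every directive in the generic div, while passing `{"note": render_note}` overrides only that one directive and leaves the rest on the default.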

If you think that going the docutils route would be much simpler, then that's not a bad approach, I just feel like the direct-to-html approach could be a more straightforward way in the long run since docutils can be difficult to work with.

This could also be a good opportunity to improve the myst-parser documentation to explain some of this a bit more.

damianavila commented 3 years ago

Thanks for the feedback, @choldgraf! I am exploring both paths (docutils and direct-to-html) but I will focus more on the direct-to-html as your intuition indicates.

choldgraf commented 3 years ago

Does that match your intuition as well? 🙂

damianavila commented 3 years ago

It depends... 😜 The docutils path gives you quick exposure to the Nikola ecosystem, whereas the direct-to-html path gives you a more fundamental answer by improving the MyST parser. I am not sure which one is actually tactically more efficient, and this is why I started pushing on both, because we may end up "chaining" them: start with the one giving a quick but measurable improvement (docutils) and present the other (direct-to-html) as one step further to improve the overall situation. But that chain should be natural enough that we do not get trapped in the docutils one. That link is what I am trying to figure out now...

damianavila commented 3 years ago

OK, I have dived into the direct-to-html path a little bit more... some thoughts for discussion...

> It seems it is possible to have a custom HTML render on top of the markdown-it-py one, but is this the proper place to add support for roles and directives independent from Sphinx? @choldgraf suggested a proper place should be instead in the Myst Parser itself, maybe with a more powerful HTML renderer (independent from Sphinx and supporting roles and directives somehow)?

I think I disagree with @choldgraf here about where the support for roles and directives should potentially live... let me elaborate...

As I said before, the to_html function uses the default_parser, and the default_parser uses the markdown-it-py based RendererHTML. That is somewhat "hardcoded" in the MyST parser: you cannot easily use a custom renderer there unless the HTML renderer is made configurable.

But the fact it is not configurable makes sense when you look at the core of the parsing process: https://github.com/executablebooks/MyST-Parser/blob/master/myst_parser/main.py#L158-L227

There, you can see the parser is built as a chained markdown-it-py object: a chain of multiple markdown-it-py plugins on top of the base object. IMHO, the design suggests adding new functionality as a markdown-it-py plugin and then relying on the base RendererHTML, which should know how to render it properly.

This is in fact in alignment with the recently created markdown-it-docutils which is, IIUC, a markdown-it plugin that provides support for roles and directives independent from docutils/sphinx, but all of that in the JS world instead of the Py one.

So, eventually, there should be a markdown-it-py plugin providing the same functionality, and then it will be just a matter of chaining that plugin only if the configured renderer is "html":

    # docutils_plugin is hypothetical: the markdown-it-py plugin does not exist yet
    if config.renderer == "html":
        md.use(docutils_plugin)

Or, alternatively, if we do not want to modify the MyST parser, we just recreate it and specifically add the not-yet-created docutils plugin (related: https://github.com/executablebooks/MyST-Parser/issues/348).
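To make the conditional chaining concrete, here is a small self-contained sketch. It deliberately uses a toy stand-in (`ToyParser`) rather than the real markdown-it-py classes, so everything in it is an assumption for illustration: `Config`, `build_parser`, the plugin functions, and the rule names are hypothetical, and only the chainable `use(...)` style mirrors what the parser setup linked above does.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Config:
    # Hypothetical option, mirroring `config.renderer` in the snippet above.
    renderer: str = "sphinx"


class ToyParser:
    """Toy stand-in for a chainable parser object (not the markdown-it-py API)."""

    def __init__(self) -> None:
        self.rules: List[str] = []

    def use(self, plugin: Callable[["ToyParser"], None]) -> "ToyParser":
        plugin(self)   # a plugin extends the parser in place
        return self    # returning self is what allows .use(...).use(...) chains


def front_matter_plugin(md: ToyParser) -> None:
    md.rules.append("front_matter")


def footnote_plugin(md: ToyParser) -> None:
    md.rules.append("footnote")


def docutils_plugin(md: ToyParser) -> None:
    # Hypothetical plugin: would add Sphinx-independent role/directive handling.
    md.rules.extend(["directive", "role"])


def build_parser(config: Config) -> ToyParser:
    """Build the base chain, adding role/directive support only for plain HTML."""
    md = ToyParser().use(front_matter_plugin).use(footnote_plugin)
    if config.renderer == "html":
        md.use(docutils_plugin)
    return md
```

With `Config(renderer="html")` the chain ends up with the extra `directive` and `role` rules, and with any other renderer it stays untouched, which is the point: the docutils/Sphinx renderers keep handling roles and directives themselves.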

So, in summary, I think any sort of basic support for directives, as @choldgraf indicated above, should be implemented as a markdown-it-py plugin.

Thoughts? Since @choldgraf suggested any implementation of directives should live in the myst parser, I am worried I am misunderstanding something basic here...

choldgraf commented 3 years ago

I was looking into the MyST Parser and markdown-it-py code as well and came to the same conclusion on my flight haha. I think you're right that we should look at the https://github.com/executablebooks/markdown-it-docutils project for inspiration about how we'd accomplish the same thing on the python side. In general, we should assume that the JS and Python implementations of MyST will behave very similarly to one another so this makes sense to me.

damianavila commented 3 years ago

Note: I have deleted a duplicated comment.

> I was looking into the MyST Parser and markdown-it-py code as well and came to the same conclusion on my flight haha.

That's great! I have started to look into markdown-it-py and its plugin internals...

> I think you're right that we should look at the https://github.com/executablebooks/markdown-it-docutils project for inspiration about how we'd accomplish the same thing on the python side.

Yep, I have started to look into that one too... but I want to see it a little bit more consolidated before deep diving into it.

> In general, we should assume that the JS and Python implementations of MyST will behave very similarly to one another

Yep, totally...

> so this makes sense to me

Super! Thanks for confirming I was not missing something big, je je :wink:

damianavila commented 2 years ago

Update:

@choldgraf and I agreed to write a series of blog posts to navigate the Nikola-MyST story and the underlying issue of how to natively support roles and directives in MyST without docutils/sphinx intervention. The ultimate idea is to showcase the problem, bring attention from the community, and foster discussion about the topic.

I have started on the content of those blog posts, and the general outline (which needs to be split across multiple blog posts) would be something like this (feedback welcome):

Myst support for Nikola

Current support

Existing Myst plugin for Nikola

Problems with current support

Pros and Cons/Limitations

Possible Alternatives

Python side

JS side

Myst support for roles and directives

How it is now...

Why do we want to go independent from docutils/sphinx?

Discuss possible implementations

Some roles and directives on Nikola via Myst

Showcase a simple role or directive implementation via the Python API.

Showcase the JS alternative through templating via markdown-it-docutils (and maybe markdown-it-myst?)

damianavila commented 2 years ago

@choldgraf I have updated the first message to create some child tickets addressing each part of the proposed layout I shared above.